Abstract

We present a new approach to automated reasoning about higher-order programs by endowing symbolic
execution with a notion of higher-order, symbolic values.

To validate our approach, we use it to develop and evaluate a system for verifying and refuting 
behavioral software contracts of components in a functional language, which we call soft contract 
verification. In doing so, we discover a mutually beneficial relation between behavioral contracts 
and higher-order symbolic execution. Contracts aid symbolic execution by providing a rich language 
of specifications that can serve as the basis of symbolic higher-order values; the theory of blame 
enables modular verification and leads to the theorem that verified components can’t be blamed;
and the run-time monitoring of contracts enables soft verification whereby verified and unverified 
components can safely interact and verification is not an all-or-nothing proposition. Conversely, 
symbolic execution aids contracts by providing compile-time verification which increases assurance 
and enables optimizations; automated test-case generation for contracts with counter-examples; and 
engendering a virtuous cycle between verification and the gradual spread of contracts. 

Our system uses higher-order symbolic execution, leveraging contracts as a source of symbolic 
values including unknown behavioral values, and employs an updatable heap of contract invariants to 
reason about flow-sensitive facts. Whenever a contract is refuted, it reports a concrete counterexample 
reproducing the error, which may involve solving for an unknown function. The approach is able to 
analyze first-class contracts, recursive data structures, unknown functions, and control-flow-sensitive 
refinements of values, which are all idiomatic in dynamic languages. It makes effective use of an
off-the-shelf solver to decide problems without heavy encodings. Our counterexample search is sound
and relatively complete with respect to a first-order solver for base type values. Therefore, it can form 
the basis of automated verification and bug-finding tools for higher-order programs. The approach is 
competitive with a wide range of existing tools—including type systems, flow analyzers, and model 
checkers—on their own benchmarks. We have built a tool which analyzes programs written in Racket, 
and report on its effectiveness in verifying and refuting contracts. 


Prior publications 

This paper unifies and expands upon the work presented in the papers “Soft contract verification,”
in Proceedings of the 19th ACM SIGPLAN International Conference on Functional
Programming (Nguyen et al. 2014) and “Relatively complete counterexamples for higher-order
programs,” in Proceedings of the 36th ACM SIGPLAN Conference on Programming
Language Design and Implementation (Nguyen and Van Horn 2015). It also subsumes the
work in the paper “Higher-order symbolic execution via contracts,” in Proceedings of the
ACM International Conference on Object Oriented Programming Systems Languages and
Applications (Tobin-Hochstadt and Van Horn 2012).

ZU064-05-FPR paper-jfp 20 March 2016 10:4


1 Static verification for dynamic languages 

Contracts (Meyer 1991; Findler and Felleisen 2002) have become a prominent mechanism 
for specifying and enforcing invariants in dynamic languages (Disney 2013; Plosch 1997; 
Austin et al. 2011; Strickland et al. 2012; Hickey et al. 2013). They offer the expressivity 
and flexibility of programming in a dynamic language, while still giving strong guarantees
about the interaction of components. However, there are two downsides: (1) contract
monitoring is expensive, often prohibitively so, which causes programmers to write more 
lax specifications, compromising correctness for efficiency; and (2) contract violations 
are found only at run-time, which delays discovery of faulty components with the usual 
negative engineering consequences. 

Static verification of contracts would empower programmers to state stronger properties, 
get immediate feedback on the correctness of their software, and avoid worries about run-time
enforcement cost since, once verified, contracts could be removed. All-or-nothing
approaches to verification of typed functional programs have seen significant advances in
the recent work on static contract checking (Xu et al. 2009; Xu 2012; Vytiniotis et al. 2013), 
refinement type checking (Terauchi 2010; Zhu and Jagannathan 2013; Vazou et al. 2013, 
2014), and model checking (Kobayashi 2009b; Kobayashi et al. 2010, 2011). However, the 
highly dynamic nature of untyped languages makes verification more difficult. 

Programs in dynamic languages are often written in idioms that thwart even simple 
verification methods such as type inference. Moreover, contracts themselves are written 
within the host language in the same idiomatic style. This suggests that moving beyond 
all-or-nothing approaches to verification is necessary. 

Fortunately, contracts themselves give us the tools to enable these new approaches, by 
describing values and by partitioning programs on boundaries. We dub our approach soft
contract verification, enabling piecemeal and modular verification of contracts. This approach
augments a standard reduction semantics for a functional language with contracts
and modules by endowing it with a notion of “unknown” values refined by sets of contracts.
Verification is carried out by executing programs on abstract values. 

Two crucial ideas from contracts allow us to go from whole-program, first-order approaches
to modular, higher-order contract verification.

• First, contracts as abstract values provide a language of specifications that scales 
to higher-order values and can encompass arbitrary specifications. This means that 
whatever guarantees a client needs, they can be specified in the interface and handled 
by our approach. 

• Second, blame to partition programs makes modular analysis possible. In a higher-order
system, behavioral values can flow across module boundaries. Determining
what a modular analysis means in this setting is tricky, but again contracts provide
the answer. By re-using the concept of blame from Findler and Felleisen (2002), 
we define the errors that we rule out as exactly those that blame the portion of 
the program under consideration. This crucial distinction will become especially 
important when considering the behavior of unknown higher-order values. 

To demonstrate the first step in applying our approach, consider the following contrived, 
but illustrative example. Let positive? and negative? be predicates for positive and negative
integers. Contracts can be arbitrary predicates, so these functions are also contracts.
Consider the following contracted function (written in Racket (Flatt and PLT 2010)): 

(define/contract (f x)
  (positive? . -> . negative?) ; contract
  (* x -1))

We can verify this program by (symbolically) running it on an “unknown” input. Checking 
the domain contract refines the input to be an unknown satisfying the set of contracts 
{positive?}. By embedding some basic facts about positive?, negative?, and -1 into
the reduction relation for *, we conclude (* {positive?} -1) ⟼ {negative?}, and
voilà, we’ve shown once and for all that f meets its contract obligations and cannot be blamed.
We could therefore soundly eliminate any contract which blames f, in this case negative?.
At its core, we rely on a simple idea: symbolic execution naturally breaks down programs 
into simpler components, enabling effective reasoning about seemingly-complex features. 
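
As a minimal sketch of this first step, assume an abstract value is simply a set of predicate names. The helper names sign-of, abs-mul-neg1, and proves? are hypothetical and are not taken from the implementation; the sketch hard-codes only the facts about positive?, negative?, and -1 mentioned above.

```racket
#lang racket
;; Illustrative only: an abstract value is a set of predicate names.

;; sign-of : reports the sign implied by a refinement set, or #f if unknown.
(define (sign-of refinements)
  (cond [(set-member? refinements 'positive?) 'pos]
        [(set-member? refinements 'negative?) 'neg]
        [else #f]))

;; abs-mul-neg1 : abstract counterpart of (* x -1) on an unknown x.
(define (abs-mul-neg1 refinements)
  (match (sign-of refinements)
    ['pos (set 'negative?)]
    ['neg (set 'positive?)]
    [#f   (set)]))          ; no sign information: the result is unrefined

;; proves? : a contract is discharged when the refinement set contains it.
(define (proves? refinements contract)
  (set-member? refinements contract))

;; Symbolic run of f's body on an input refined by its domain contract.
(proves? (abs-mul-neg1 (set 'positive?)) 'negative?)
```

An input refined only by some unrelated predicate yields the empty refinement set, so the range check would remain unproved and the contract would have to stay in place.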

This simple approach, building on the two lessons of contracts we have described, is 
effective for small examples, but insufficient to scale to realistic programs. In this paper, 
we show how the initial approach can handle tricky problems and larger programs by 
incorporating several additional techniques. 

Solver-aided reasoning: While embedding symbolic arithmetic knowledge for specific, 
known contracts works for simple examples, it fails to reason about arithmetic generally. 
Contracts often fail to verify because equivalent formulations of contracts are not hard-coded
in the semantics of primitives. Many systems address this issue by incorporating
an SMT solver. However, for a higher-order language, solver integration is often achieved 
by reasoning in a theory of uninterpreted functions or semantic embeddings (Knowles and 
Flanagan 2010; Rondon et al. 2008; Vytiniotis et al. 2013). 

In this paper, we observe that higher-order contracts can be effectively verified using 
only a simple first-order solver. The key insight is that contracts delay higher-order checks 
and failures always occur with a first-order witness. By relying on a (symbolic) semantic
approach to carry out higher-order contract monitoring, we can use an SMT solver to reason 
about integers without the need for sophisticated encodings. (Examples in §2.3.) 

Flow sensitive reasoning: Just as our semantic approach decomposes higher-order contracts
into first-order properties, first-order contracts naturally decompose into conditionals.
If the verification procedure did not take this into account, even simple examples would fail 
to verify: 


(g : integer? -> negative?)
(define (g x) (if (positive? x) (f x) (f 8)))

This is because the true-branch call to f is (f {integer?}) by substitution, although we 
know from the guard that x satisfies positive?. 

In this paper, we observe that flow-sensitivity can be achieved by replacing substitution 
with heap-allocated values. These heap addresses are then refined as they flow through 
predicates and primitive operations, with no need for special handling of contracts (§2.2). 
As a result, the system is not only effective for contract verification, but can also handle 
safety verification for programs with no contracts at all. 

First-class contracts: Pragmatic contract systems enable first-class contracts so new combinators
can be written as functions that consume and produce contracts. But to the best of
our knowledge, no verification system currently supports first-class contracts (or refinements),
and in most approaches it appears fundamentally difficult to incorporate such a
notion.

Because we handle contracts (and all other features) by execution, first-class contracts
pose no significant technical challenge and our system reasons about them effectively
(§2.6).

Refuting contracts with concrete counterexamples: Generating inputs that crash first-order
programs is a well-studied problem in the literature on symbolic execution (Cadar
et al. 2006; Godefroid et al. 2005), type systems (Foster et al. 2002), flow analysis (Xie and
Aiken 2005), and software model checking (Yang et al.). However, in the setting of
higher-order languages, those that treat computations
as first-class values, research has largely focused on the verification of programs without
investigating how to effectively report counterexamples as concrete inputs when verification
fails (e.g., Rondon et al. (2008); Xu et al. (2009); Kawaguchi et al. (2010); Vytiniotis
et al. (2013); Tobin-Hochstadt and Van Horn (2012)), or has restricted unknown inputs to
first-order values (e.g., Kobayashi et al. (2011)). Searching for a counterexample witnessing each
program bug seems futile in the presence of higher-order unknown inputs: after all, the
space of possibilities is huge, and most SMT solvers do not produce models for higher-order
unknown values.

Nevertheless, we recognize that even though there are numerous higher-order inputs, 
they trigger program errors in their contexts following only a few specific patterns. Therefore,
instead of searching through the space of all possible functions for a counterexample,
we only consider a small subset of functions of specific shapes. The remarkable result is
that this method enjoys strong guarantees: each counterexample triggers a real contract
violation (soundness), and given an SMT solver that is complete for base data types, our
method constructs a counterexample reproducing each possible contract violation (relative 
completeness). 

Converging for complex recursion: Of course, simply executing programs has a fundamental
drawback: it will fail to terminate in many cases, and when the inputs are unknown,
execution will almost always diverge. Simply detecting cycles in the state space handles
straightforward tail-recursive functions, but not more complex recursive calls. Without a
solution to this problem, even simple programs operating over inductive data would be
impossible to verify.

In this paper, we accelerate the convergence of programs by identifying and approximating
regular accumulation of evaluation contexts, causing common recursive programs
to converge on unknown values, while providing precise predictions (§2.5). As with the 
rest of our approach, this happens during execution and is therefore robust to complex, 
higher-order control flow. 


Combining these techniques yields a system competitive with a diverse range of existing 
powerful static checkers, achieving many of their strengths in concert, while balancing the 
benefits of static contract verification with the flexibility of dynamic enforcement. 

We have built a prototype soft verification engine, which we dub SCV, based on these 
ideas and used it to evaluate the approach (§4). Our evaluation demonstrates that the approach
can verify properties typically reserved for approaches that rely on an underlying
type system, while simultaneously accommodating the dynamism and idioms of untyped
programming languages. We take examples from work on soft typing (Cartwright and
Fagan 1991; Wright and Cartwright 1997), type systems for untyped languages
(Tobin-Hochstadt and Felleisen 2010), static contract checking (Xu et al. 2009; Xu 2012),
refinement type checking (Terauchi 2010), and model checking of higher-order functional
languages (Kobayashi 2009b; Kobayashi et al. 2010, 2011).

SCV can prove all contract checks redundant for almost all of the examples taken from 
this broad array of existing program analysis and type checking work, and can handle many 
of the tricky higher-order verification problems demonstrated by other systems. In other 
words, our approach is competitive with type systems, model checkers, and soft typing 
systems on each of their chosen benchmarks—in contrast, work on higher-order model 
checking does not handle benchmarks aimed at soft typing or occurrence typing, and vice 
versa. In the cases where SCV does not prove the complete absence of contract errors, 
the vast majority of possible dynamic errors are ruled out, justifying numerous potential 
optimizations. Over this corpus of programs, 99% of the contract and run-time type checks 
are proved safe, and could be eliminated. 

We also evaluate the verification of three small interactive video games which use first-class
and dependent contracts pervasively. The results show the subsequent elimination of
contract monitoring has a dramatic effect: from a speed-up factor of 7 in one case, to three
orders of magnitude in the others. In essence, these results show the games are infeasible
without contract verification.


2 Worked examples 

We now present the main ideas of our approach through a series of examples taken from 
work on other verification techniques, starting from the simplest and working up to a 
complex object encoding.

2.1 Higher-order symbolic reasoning

Consider the following simple function that transforms functions on even integers into 
functions on odd integers. It has been ascribed this specification as a contract, which can 
be monitored at run-time. 

(e2o : (even? -> even?) -> (odd? -> odd?))
(define (e2o f)
  (λ (n) (- (f (+ n 1)) 1)))

A contract monitors the flow of values between components. In this case, the contract 
monitors the interaction between the context and the e2o function. It is easy to confirm 
that e2o is correct with respect to the contract; e2o holds up its end of the agreement, and 
therefore cannot be blamed for any run-time failures that may arise. The informal reasoning 
goes like this: First assume f is an even? -> even? function. When applied, we must ensure 
the argument is even (otherwise e2o is at fault), but may assume the result is even (otherwise 
the context is at fault). Next assume n is odd (otherwise the context is at fault) and ensure the 
result is odd (otherwise e2o is at fault). Since (+ n 1) is even when n is odd, f is applied to 
an even argument, producing an even result. Subtracting one therefore gives an odd result, 
as desired. 

This kind of reasoning mimics the step-by-step computation of e2o, but rather than 
considering some particular inputs, it considers these inputs symbolically to verify all possible
executions of e2o. We systematize this kind of reasoning by augmenting a standard
reduction semantics for contracts with symbolic values that are refined by sets of contracts. 
At first approximation, the semantics includes reductions such as: 

(+ {odd?} 1) ⟼ {even?}, and
({even? -> even?} {even?}) ⟼ {even?}.

This kind of symbolic reasoning mimics a programmer’s informal intuitions which employ
contracts to refine unknown values and to verify components meet their specifications.
If a component cannot be blamed in the symbolic semantics, we can safely conclude it 
cannot be blamed in general. 
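
For comparison with the symbolic reading, the following sketch runs e2o under ordinary run-time monitoring in full Racket. It assumes the standard racket/contract arrow combinator corresponds to the arrow notation above; the client functions are made up for illustration.

```racket
#lang racket
;; e2o with its contract attached at the definition boundary.
(define/contract (e2o f)
  (-> (-> even? even?) (-> odd? odd?))
  (λ (n) (- (f (+ n 1)) 1)))

;; A well-behaved context: doubling maps even integers to even integers.
(define g (e2o (λ (x) (* 2 x))))
(g 3)   ; (+ 3 1) is 4, doubled is 8, minus 1 is 7

;; A faulty context: the supplied function maps 4 to 5, which is not even.
;; The monitor raises a blame error, caught here so the script keeps running.
(define bad (e2o (λ (x) (+ x 1))))
(with-handlers ([exn:fail:contract:blame? (λ (e) 'blame-raised)])
  (bad 3))
```

In the second call the failure comes from the argument function breaking its even? range, which by the informal reasoning above is the context's responsibility and not that of e2o.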


2.2 Flow sensitive reasoning 

Programmers using untyped languages often use a mixture of type-based and flow-based 
reasoning to design programs. The analysis naturally takes advantage of type tests idiomatic 
in dynamic languages even when the tests are buried in complex expressions. The following 
function taken from work on occurrence typing (Tobin-Hochstadt and Felleisen 2010) can 
be proven safe using our symbolic semantics: 

(f : (or/c int? str?) cons? -> int?)
(define (f x p)
  (cond
    [(and (int? x) (int? (car p))) (+ x (car p))]
    [(int? (car p)) (+ (str-len x) (car p))]
    [else 0]))

Here, int?, str?, and cons? are type predicates for integers, strings, and pairs, respectively.
The contract (or/c int? str?) uses the or/c contract combinator to construct a
contract specifying a value is either an integer or a string.

A programmer would convince themselves this program was safe by using the control 
dominating predicates to refine the types of x and (car p) in each branch of the conditional.¹
Our symbolic semantics accommodates exactly this kind of reasoning in order
to verify this example. However, there is a technical challenge here. A straightforward 
substitution-based semantics would not reflect the flow-sensitive facts. Focusing just on 
the first clause, a substitution model would give: 

(cond
  [(and (int? {(or/c int? str?)}) (int? (car {cons?})))
   (+ {(or/c int? str?)} (car {cons?}))] ...)

At this point, it’s too late to communicate the refinement of these sets implied by the test 
evaluating to true, so the semantics would report the contract on + potentially being violated 
because the first argument may be a string, and the second argument may be anything. We 
overcome this challenge by modelling symbolic values as heap-allocated sets of contracts. 
When predicates and data structure accessors are applied to heap addresses, we refine the 
corresponding sets to reflect what must be true. So the program is modelled as: 

(cond
  [(and (int? L1) (int? (car L2)))
   (+ L1 (car L2))] ...)

where L1 ↦ {(or/c int? str?)}, L2 ↦ {cons?}

When evaluation of the test reaches (int? L1), the semantics conceptually forks
the evaluator and refines the heap:

(int? L1) ⟼ true, where L1 ↦ {int?}
          ⟼ false, where L1 ↦ {str?}

Similar refinements to L2 are communicated through the heap for (int? (car L2)), thereby
making (+ L1 (car L2)) safe. This simple idea is effective in achieving flow-based
refinements. It naturally handles deeply nested and inter-procedural conditionals. 
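
The forking step can be sketched in a few lines, assuming a heap is an immutable hash from addresses to sets of predicate names. The function apply-pred and the tag 'int-or-str? (standing for the contract (or/c int? str?)) are hypothetical names chosen for this illustration, and only the two predicates int? and str? are modelled.

```racket
#lang racket
;; Illustrative only. A heap maps addresses to sets of predicate names.

;; apply-pred : applies pred ('int? or 'str?) to the value stored at addr.
;; Returns a list of (cons answer heap) pairs, one per feasible branch.
(define (apply-pred heap addr pred)
  (define known (hash-ref heap addr))
  (define other (if (eq? pred 'int?) 'str? 'int?))
  (cond
    [(set-member? known pred)  (list (cons #t heap))]   ; definitely true
    [(set-member? known other) (list (cons #f heap))]   ; definitely false
    [(set-member? known 'int-or-str?)                    ; fork and refine
     (list (cons #t (hash-set heap addr (set pred)))
           (cons #f (hash-set heap addr (set other))))]
    [else                                                ; nothing known yet
     (list (cons #t (hash-set heap addr (set-add known pred)))
           (cons #f heap))]))

(define heap0 (hash 'L1 (set 'int-or-str?)))
(define branches (apply-pred heap0 'L1 'int?))
(define true-heap  (cdr (first branches)))    ; L1 is refined to {int?}
(define false-heap (cdr (second branches)))   ; L1 is refined to {str?}

;; In the true branch a repeated test no longer forks.
(apply-pred true-heap 'L1 'int?)
```

Because the refinement lives in the heap and not in the program text, every later use of the same address sees it, which is what makes the addition in the first clause safe.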

2.3 Incorporating an SMT solver

The techniques described so far are highly effective for reasoning about functions and many
kinds of recursive data structures. However, effective reasoning about many kinds of base
values, such as integers, requires sophisticated domain-specific knowledge. Rather than
build such a tool ourselves, we defer to existing high-quality solvers for these domains.
Unlike many solver-aided verification tools, however, we use the solver only for queries
on base values, rather than attempting to encode a rich, higher-order language into one that
is accepted by the solver. This obviates the need for a general (and error-prone) translation
of the language. For example, there is no need to embed an untyped language’s “unityped
type system” into the solver’s type system.

¹ The call to str-len is safe because (and (int? x) (int? (car p))) being false and (int?
(car p)) being true implies that (int? x) is false, which in turn implies x is a string as enforced
by f’s contract.

To demonstrate our approach, we take an example (intro3) from work on model checking
higher-order programs (Kobayashi et al. 2011).

; (>/c n) abbreviates (λ (x) (> x n))
(define (f x g) (g (+ x 1)))

(h : [x : int?] -> [y : (and/c int? (>/c x))] -> (and/c int? (>/c y)))
(define (h x) ...) ; unknown definition

(main : int? -> (and/c int? (>/c 0)))
(define (main n) (if (> n 0) (f n (h n)) 1))

In this program, we define a contract combinator (>/c) that creates a check for an integer 
from a lower bound; a helper function f, which comes without a contract; and an unknown 
function h that given an integer x, returns a function mapping some number y that is 
greater than x to an answer greater than y—here h’s specification is given, but not its 
implementation. (Note h’s contract is dependent.) We verify main’s correctness, which 
means it definitely returns a positive integer and does not violate h’s contract. 

According to its contract, main is passed an integer n. If n is not positive, main returns 1,
satisfying the contract. Otherwise the function applies f to n and (h n). Function h, by
its contract, returns another function that requires a number greater than n. Examining f’s
definition, we see that (h n) (now bound to g) is eventually applied to (+ n 1). Let n1 be the result
of (+ n 1). By h’s contract, we know the answer of this application is another integer greater
than n1. Let us name this answer n2. In order to verify that main satisfies contract (>/c 0),
we need to verify that n2 is a positive integer.

Once f returns, the heap contains several addresses with contracts: 


n  ↦ {int?, (>/c 0)}
n1 ↦ {int?, (=/c (+ n 1))}
n2 ↦ {int?, (>/c n1)}


We then translate this information to a query for an external solver: 

n, n1, n2 : INT;
ASSERT n > 0;
ASSERT n1 = n + 1;
ASSERT n2 > n1;
QUERY n2 > 0;

Solvers such as CVC4 (Barrett et al. 2011) and Z3 (Moura and Bjorner 2008) easily verify 
this implication, proving main’s correctness. 
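
Spelled out by hand, the implication is a short chain of inequalities over the integers. This is only a worked restatement of the query, writing n_1 and n_2 for the two intermediate heap addresses:

```latex
\[
  n > 0 \;\wedge\; n_1 = n + 1 \;\wedge\; n_2 > n_1
  \;\Longrightarrow\; n_2 > n_1 = n + 1 > n > 0
  \;\Longrightarrow\; n_2 > 0 .
\]
```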

Refinements such as (>/c 0) are generated by primitive applications (> x 0), and queries
are generated from translation of the heap, not arbitrary expressions. This has a few
consequences. First, by the time we have value v satisfying predicate p on the heap, we
know that p terminates successfully on v. Issues such as errors (from p itself) or divergence
are handled elsewhere in other evaluation branches. Second, we only need to translate
a small set of simple, well understood contracts, not arbitrary expressions. Evaluation
naturally breaks down complex expressions, and properties are discovered even when they
are buried in complex, higher-order functions. Given a translation for (>/c 0), the analysis
automatically takes advantage of the solver even when the predicate contains > in a complex
way, such as (λ (x) (or (> x 0) E)) where E is an arbitrary expression. Predicates that
lack translations to SMT only reduce precision, never soundness.


2.4 Generating higher-order counterexamples 

Programmers benefit not only from verification of correct programs, but also refutation 
of incorrect programs through concrete counterexamples: they minimize the confusion 
between a true bug and a false warning, and provide programmers with insight into their 
code’s defects. 

In the following program, f’s contract promises that if its argument is a function returning
an integer, then f returns an integer. In its body, f performs a division involving the
application of its argument to 42. 

(f : (int? -> int?) -> int?)
(define (f g)
  (/ 1 (- 100 (g 42))))

Function f’s definition is unsafe in two ways. First, the division is not protected against
a denominator of 0. Second, / potentially returns a non-integer quotient, causing f to violate the int?
contract in its range.

In this case, the only way function g interacts with the code under verification (function
f) is through its returned value. Because g is applied only to 42 in this case, it suffices to
search for instantiations of g in the space of constant functions of the form (λ (_) n) with
n being an unknown integer. The system can therefore produce two counterexamples that trigger
the two potential bugs in the program:

Contract violation: f violates contract with /
Value 0 violates contract (not/c (=/c 0))
An example that triggers this violation:
(f (λ (n) 100))

Contract violation: f violates its own contract
Value -1/2 violates contract int?
An example that triggers this violation:
(f (λ (n) 102))

In more complex programs, an unknown function can interact and trigger errors in multiple
ways: either by returning another value to be consumed by the context, or by applying
a function coming from the context to some values. The counterexample can be more
complex, but in each case, it is only how the unknown function interacts with its context that
is relevant to producing a counterexample. The system need not consider instantiations of
unknown functions that perform irrelevant work, diverge, or have their own errors. 
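
Both reported counterexamples for f can be replayed directly in full Racket. The sketch below drops f's contract so that the raw failures are visible, and uses Racket's integer? in place of the abbreviation int?.

```racket
#lang racket
;; f without its contract, so the underlying failures are visible.
(define (f g)
  (/ 1 (- 100 (g 42))))

;; First counterexample: g returns 100, so the divisor is 0.
(with-handlers ([exn:fail:contract:divide-by-zero? (λ (e) 'division-by-zero)])
  (f (λ (n) 100)))

;; Second counterexample: g returns 102, so the divisor is -2 and the
;; result is the exact rational -1/2, which is not an integer.
(define r (f (λ (n) 102)))
(integer? r)
```

Both inputs are constant functions, matching the observation that only the value g returns matters to f.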

2.5 Converging for non-tail recursion

The techniques sketched above provide high precision in the examples considered, but 
simply executing programs on abstract values is unlikely to terminate in the presence of 
recursion. When an abstract value stands for an infinite set of concrete values, execution 
may unfold infinitely, building up an ever-growing evaluation context. To tackle this problem,
we summarize this context to coalesce repeated structures and enable termination on
many recursive programs. Although guaranteed termination is not our goal, the empirical 
results (§4) demonstrate that the method is effective in practice. 

The following example program is taken from work on model checking of higher-order 
functional programs (Kobayashi et al. 2011), and demonstrates checking non-trivial safety 
properties on recursive functions. Note that no loop invariants need be provided by the user. 

(main : (and/c int? (>=/c 0)) -> (and/c int? (>=/c 0)))
(define (main n)
  (let ([l (make-list n)])
    (if (> n 0) (car (reverse l empty)) 0)))

(define (reverse l ac)
  (if (empty? l) ac
      (reverse (cdr l) (cons (car l) ac))))

(define (make-list n)
  (if (= n 0) empty
      (cons n (make-list (- n 1)))))

Again, we aim to verify both the specified contract for main as well as the preconditions 
for primitive operations such as car. Most significantly, we need to verify that (reverse
l empty) produces a non-empty list (so that car succeeds) and that its first element is a
positive integer. The local functions reverse and make-list do not come with a contract. 

This problem is more challenging than the original OCaml version of the same program, 
due to the lack of types. This program represents a common idiom in dynamic languages: 
not all values are contracted, and there is no type system on which to piggy-back verification.
In addition, programmers often rely on inter-procedural reasoning to justify their
code’s correctness, as here with reverse. 

We verify main by applying it to an abstract (unknown) value n1. The contract ensures
that within the body, n1 is a non-negative integer.

The integer n1 is first passed to make-list. The comparison (= n1 0) non-deterministically
returns true or false, updating the information known about n1 to be either 0
or (>/c 0) in each corresponding case. In the first case, make-list returns empty. In the
second case, make-list proceeds to the recursive application (make-list n2), where n2
is the abstract non-negative integer obtained from evaluating (- n1 1). However,
(make-list n2) is identical to the original call (make-list n1) up to renaming, since both n1 and
n2 are non-negative. Therefore, we pause here and use a summary of make-list’s result
instead of continuing in an infinite loop.

Since we already know that empty is one possible result of (make-list n1), we use it
as the result of (make-list n2). The application (make-list n1) therefore produces the
pair (n1, empty), which is another answer for the original application. We could continue
this process and plug this new result into the pending application (make-list n2). But we
instead first approximate (n1, empty) to a non-empty list of positive integers. This approximation
choice is guided by the observation that plugging in empty in the recursive call
gives rise to (n1, empty). We then use this approximate answer as the result of the pending
application (make-list n2). This then induces another result for (make-list n1), a list
of two or more positive integers, but this is subsumed by the previous answer of non-empty
integer list. We have now discovered all possible return values of make-list when applied
to a non-negative integer: it maps 0 to empty, and positive integers to a non-empty list of
positive integers.

Although our explanation made use of the order, the soundness of analyzing make-list
does not depend on the order of exploring non-deterministic branches. Each recursive
application with repeated arguments generates a waiting context, and each function return 
generates a new case to resume. There is an implicit work-list algorithm in the modified 
semantics (§3.8.2). 
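
The summary for make-list can be reproduced with a deliberately simplified sketch. It assumes a two-point abstraction of integers ('zero and 'pos) and of lists ('empty and 'nonempty), and it uses plain round-robin iteration to a fixed point instead of the waiting contexts of the modified semantics, so it illustrates the result of the analysis and not its mechanism. All names are hypothetical.

```racket
#lang racket
;; Illustrative only. Abstract integers: 'zero, 'pos.
;; Abstract lists: 'empty, 'nonempty.
;; A summary maps each abstract argument of make-list to its known results.

;; abs-sub1 : (- n 1) for a positive n is either zero or positive.
(define (abs-sub1 n)
  (match n ['pos (set 'zero 'pos)]))

;; step : recompute every entry, reading recursive calls from the summary.
(define (step summary)
  (hash 'zero (set 'empty)                         ; the (= n 0) branch
        'pos  (for*/set ([m (abs-sub1 'pos)]       ; recursive argument
                         [r (hash-ref summary m)]) ; a known recursive result
                'nonempty)))                       ; (cons n r) is never empty

;; fixpoint : iterate until the summary stops changing.
(define (fixpoint summary)
  (define next (step summary))
  (if (equal? next summary) summary (fixpoint next)))

(define result (fixpoint (hash 'zero (set) 'pos (set))))
(hash-ref result 'zero)   ; the set containing only 'empty
(hash-ref result 'pos)    ; the set containing only 'nonempty
```

Starting from an empty summary, the first round learns only the base case, the second round uses it to derive the non-empty result, and the third round changes nothing, mirroring the narrative above.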

When make-list returns to main, we have two separate cases: either n1 is 0 and l is
empty, or n1 is positive and l is non-empty. In the first case, (> n1 0) is false and main
returns 0, satisfying the contract. Otherwise, main proceeds to reversing the list before 
taking its first element. 

Using the same mechanism as with make-list, the analysis infers that reverse returns 
a non-empty list when either of its arguments (l or ac) is non-empty. In addition, reverse
only receives arguments of proper lists, so all partial operations on l such as car and cdr 
are safe when l is not empty, without needing an explicit check. The function eventually 
returns a non-empty list of integers to main, justifying main’s call to the partial function 
car, producing a positive integer. Thus, main never has a run-time error in any context. 

While this analysis makes use of the implementation of make-list and reverse, that 
does not imply that it is whole-program. Instead, it is modular in its use of unknown 
values abstracting arbitrary behavior. For example, make-list could instead be an abstract
value represented by a contract that always produces lists of integers. The analysis would
still succeed in proving all contracts safe except the use of car in main; this shows the
flexibility available in choosing between precision and modularity. In addition, the analysis
does not have to be perfectly precise to be useful. If it successfully verifies most contracts 
in a module, that already greatly improves confidence about the module’s correctness and 
justifies the elimination of numerous expensive dynamic checks. 

2.6 First-class contracts 

In the following, we choose a simple encoding of classes as functions that produce objects,
where objects are again functions that respond to messages named by symbols. We then
verify the correctness of a mixin: a function from classes to classes. The vec/c contract
enforces the interface of a 2D-vector class whose objects accept messages 'x, 'y, and 'add
for extracting components and vector addition.

(define vec/c
  ([msg : (one-of/c 'x 'y 'add)]
   -> (match msg
        [(or 'x 'y) real?]
        ['add (vec/c -> vec/c)])))

This definition demonstrates several powerful contract system features which we are able 
to handle: 

• contracts are first-class values, as in the definition of vec/c, 

• contracts may include arbitrary predicates, such as real?, 

• contracts may be recursive, as in the contract for 'add,

• function contracts may express dependent relationships between the domain and 
range—the contract of the result of method selection for vec/c depends on which 
method is chosen. 
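The features listed above can be illustrated outside Racket. The following is a minimal Python sketch, not the paper's system: it monitors the vec/c interface dynamically, with the range check depending on the message and the 'add case referring back to vec/c. The names `ContractViolation`, `vec_c`, `mk_vec`, and the party labels "server" and "client" are hypothetical.

```python
from numbers import Real


class ContractViolation(Exception):
    """Contract failure; `party` records which side is blamed."""

    def __init__(self, party, detail):
        super().__init__(f"blame {party}: {detail}")
        self.party = party


def is_real(v):
    # bool is a subclass of int in Python, so exclude it explicitly
    return isinstance(v, Real) and not isinstance(v, bool)


def vec_c(obj, pos="server", neg="client"):
    """Wrap `obj` so each message send is checked against the vec/c interface."""

    def wrapped(msg):
        if msg not in ("x", "y", "add"):  # domain: (one-of/c 'x 'y 'add)
            raise ContractViolation(neg, f"unknown message {msg!r}")
        result = obj(msg)
        if msg in ("x", "y"):  # the range contract depends on msg: real?
            if not is_real(result):
                raise ContractViolation(pos, f"{msg!r} did not return a real")
            return result
        if not callable(result):
            raise ContractViolation(pos, "'add did not return a function")
        # 'add: the range is the function contract vec/c -> vec/c, which
        # mentions vec/c itself; blame roles swap for the argument.
        return lambda other: vec_c(result(vec_c(other, neg, pos)), pos, neg)

    return wrapped


def mk_vec(x, y):
    """A hypothetical concrete class that satisfies the interface."""

    def obj(msg):
        if msg == "x":
            return x
        if msg == "y":
            return y
        return lambda other: mk_vec(x + other("x"), y + other("y"))

    return obj
```

For instance, `vec_c(mk_vec(1, 2))("add")(mk_vec(3, 4))` yields a monitored vector, while sending an unknown message blames the client side.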

Suppose we want to define a mixin that takes any class that satisfies the vec/c interface 
and produces another class with added vector operations such as 'len for computing the
vector’s length. The extend function defines this mixin, and ext-vec/c specifies the new 
interface. We verify that extend violates no contracts and returns a class that respects 
specifications from ext-vec/c. 

(extend : (real? real? -> vec/c) -> (real? real? -> ext-vec/c))
(define (extend mk-vec)
  (λ (x y)
    (let ([vec (mk-vec x y)])
      (λ (m)
        (match m
          ['len
           (let ([x (vec 'x)] [y (vec 'y)])
             (sqrt (+ (* x x) (* y y))))]
          [_ (vec m)])))))

(define ext-vec/c
  ([msg : (one-of/c 'x 'y 'add 'len)]
   -> (match msg
        [(or 'x 'y) real?]
        ['add (vec/c -> vec/c)]
        ['len (and/c real? (>=/c 0))])))

To verify extend, we provide an arbitrary value, which is guaranteed by its contract to be
a class matching vec/c. The mixin returns a new class whose objects understand messages
'x, 'y, 'add, and 'len. This new class defines method 'len and relies on the underlying
class to respond to 'x, 'y, and 'add. Because the old class is constrained by contract vec/c,
the new class will not violate its contract when responding to messages 'x, 'y, and 'add.

For the 'len message, the object in the new vector class extracts its components as
abstract numbers x and y, according to interface vec/c. It then computes their squares and 
leaves the following information on the heap: 

$$
\begin{aligned}
x^2 &\mapsto \{\texttt{real?},\ \texttt{(=/c (* x x))}\} \\
y^2 &\mapsto \{\texttt{real?},\ \texttt{(=/c (* y y))}\} \\
s &\mapsto \{\texttt{real?},\ (\texttt{=/c}\ (\texttt{+}\ x^2\ y^2))\}
\end{aligned}
$$

Solvers such as Z3 (Moura and Bjørner 2008) can handle simple non-linear arithmetic and
verify that the sum s is non-negative, thus the sqrt operation is safe. Execution proceeds
to take the square root (now called l) and refines the heap with the following mapping:

$$l \mapsto \{\texttt{real?},\ \texttt{(=/c (sqrt s))}\}$$

When the method returns, its result is checked by contract ext-vec/c to be a non-negative 
number. We again rely on the solver to prove that this is the case. 
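The reasoning over the heap can be pictured with a toy stand-in for the solver query. The sketch below is an assumption-laden illustration, not the paper's proof system and not Z3: facts are plain tuples, every address is assumed to hold a real, and the hypothetical function `provably_nonneg` only knows that squares are non-negative and that sums and products of non-negative terms are non-negative.

```python
def provably_nonneg(heap, e):
    """True only if `e` is provably >= 0; False means "unknown", not "negative"."""
    if isinstance(e, (int, float)):
        return e >= 0
    if isinstance(e, tuple):
        op, a, b = e
        if op == "*":
            # a real times itself is a square; otherwise both factors must be >= 0
            return a == b or (provably_nonneg(heap, a) and provably_nonneg(heap, b))
        if op == "+":
            return provably_nonneg(heap, a) and provably_nonneg(heap, b)
        return False
    # `e` is an address: consult the equality refinements recorded for it
    return any(tag == "eq" and provably_nonneg(heap, rhs)
               for tag, rhs in heap.get(e, []))


# The heap from the 'len example, with ("eq", expr) playing the role of (=/c expr).
heap = {
    "x": [("real", None)],
    "y": [("real", None)],
    "x2": [("real", None), ("eq", ("*", "x", "x"))],
    "y2": [("real", None), ("eq", ("*", "y", "y"))],
    "s": [("real", None), ("eq", ("+", "x2", "y2"))],
}
```

Under these assumptions `provably_nonneg(heap, "s")` holds, which is the fact needed to justify sqrt, whereas nothing can be concluded about `"x"` alone.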

Therefore, extend is guaranteed to produce a new class that is correct with respect to
interface ext-vec/c, justifying the elimination of expensive run-time checks. In a Racket
program computing the length of 100000 random vectors, eliminating these contracts results
in a 100-fold speed-up. While such dramatic results are unlikely in full programs, measurements
of existing Racket programs suggest that 50% speedups are possible (Strickland
et al. 2012).


3 A Symbolic Language with Contracts 

In this section, we give a reduction system describing the core of our approach. Symbolic
$\lambda_C$ is a model of a pure functional language with first-class contracts and symbolic values.
We first present the semantics, including handling of primitives and unknown functions,
that facilitates finding bugs and constructing test cases reproducing each reachable contract
violation. We then describe how the handling of primitive values integrates with external
solvers. Finally, we show an abstraction of our system to accelerate convergence, turning the
bug-finding semantics into a practical verification. For each abstraction, we relate concrete
and symbolic programs and prove a soundness theorem.

At a high level, the key idea of our semantics is that abstract values behave non-deterministically
in all possible ways that concrete values might behave. Furthermore, abstract values
can be bounded by specifications in the form of contracts that limit these behaviors. As a 
result, an operational semantics for abstract values explores all the ways that the concrete 
program under consideration might be used. 

Given this operational semantics, we can then examine the results of evaluation to see 
if any results are errors blaming the components we wish to verify. If they do not, then 
our soundness theorem implies that there are no ways for the component to be blamed, 
regardless of what other parts of the program do. Thus, we have verified the component 
against its contract in all contexts. We make this notion precise in section 3.6. 


3.1 Syntax of Symbolic $\lambda_C$

Our initial language models the functional core of many modern dynamic languages, extended
with behavioral, first-class contracts, as well as symbolic values. The abstract syntax
is shown in figure 1. Syntax needed only for symbolic execution is highlighted in gray; we 
discuss it after the syntax of concrete programs. 

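A program P is a sequence of module definitions followed by a top-level expression
which may reference the modules. Each module M has a name $\ell$ and exports a single
value V with behavior enforced by contract $V_c$. (Generalization to multiple-export modules
is straightforward.)

$$
\begin{array}{lrcl}
\text{Programs} & P, Q & ::= & \vec{M}\ E \\
\text{Modules} & M & ::= & (\mathsf{module}\ \ell\ V_c\ V) \\
\text{Expressions} & E, C & ::= & A \mid X \mid \ell^{\ell} \mid (E\ E)^{\ell} \mid (O\ \vec{E})^{\ell} \mid \mathsf{if}\ E\ E\ E \mid C \to \lambda X.C \\
 & & \mid & \mathsf{mon}^{\ell,\ell}_{\ell}(C, E) \mid \mathsf{assume}(V, V) \\
\text{Answers} & A & ::= & V \mid \mathsf{blame}^{\ell}_{\ell} \\
\text{Values} & V & ::= & U \mid L \\
\text{Concrete Values} & U & ::= & n \mid \lambda X.E \mid V \to \lambda X.C \\
\text{Operations} & O & ::= & O? \mid \mathsf{add1} \mid + \mid = \mid \ldots \\
\text{Predicates} & O? & ::= & \mathsf{zero?} \mid \mathsf{int?} \mid \mathsf{proc?} \mid \mathsf{dep?} \\
\text{Variables} & X, \ell & \in & \text{identifier} \\
\text{Addresses} & L & \in & \text{address}
\end{array}
$$

Fig. 1. Syntax of Symbolic $\lambda_C$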

Expressions include standard forms such as values, variable and module references, 
applications, and conditionals, as well as those for constructing and monitoring contracts. 
Contracts are first-class values and can be produced by arbitrary expressions. For clarity, 
when an expression plays the role of a contract, we use the metavariable C, rather than E. 
A dependent function contract $(C \to \lambda X.C')$ monitors a function's argument with C and its
result with the contract produced by applying $\lambda X.C'$ to the argument.

A contract violation at run-time causes blame, an error with information about who
violated the contract. We write $\mathsf{blame}^{\ell}_{\ell''}$ to mean module $\ell$ is blamed for violating the
contract from $\ell''$. The form $\mathsf{mon}^{\ell,\ell'}_{\ell''}(C, E)$ monitors expression E with contract C, with
$\ell$ being the positive party, $\ell'$ the negative party, and $\ell''$ the source of the contract. The
system blames the positive party if E produces a value violating C, and the negative one if
E is misused by the context of the contract check. To make context information available
at run-time, we annotate references and applications with labels indicating the module they
appear in, or $\dagger$ for the top-level expression. For example, $\ell^{\ell'}$ denotes a reference to the
name $\ell$ from module $\ell'$, and $(\mathsf{add1}\ X)^{\dagger}$ denotes an addition inside the top level. When a
module $\ell$ causes a primitive error, such as applying 5, we also write $\mathsf{blame}^{\ell}_{\Lambda}$, indicating
that it violates a contract with the language. Monitoring forms, blaming forms, and labels
are not available for programmers to write. We omit labels when they are irrelevant or can
be inferred.

Concrete values U include abstractions, integers, and dependent contracts with domain
components evaluated. We use 0 to indicate falsehood and any other value for truth. Primitive
operations over values are standard, including predicates $O?$ for dynamic testing of
data types.

To reason about absent components, we equip $\lambda_C$ with unknown, or symbolic, values,
which abstract over multiple concrete values exhibiting a range of behavior. Each address
L identifies an arbitrary but fixed and syntactically closed² value in the program. For
soundness, execution must account for all possible concretizations of unknown values, and
reduction becomes non-deterministic. As execution progresses through tests and contract
checks, more assumptions can be made about symbolic values in each non-deterministic
branch. To track refinements of symbolic values, we use a heap that maps each address to
a refinable value, which includes concrete values as well as abstract values of the form
$\bullet^{\vec{V}}$ and $\mathsf{case}[\overrightarrow{V \mapsto L}]$. The form $\bullet^{\vec{V}}$ denotes a value known to satisfy contract set
$\vec{V}$ but otherwise unknown; we omit displaying this contract set when it is empty, irrelevant,
or can be inferred from context. The form $\mathsf{case}[\overrightarrow{V \mapsto L}]$ is used internally and denotes a
mapping between values, which we discuss further in section 3.2.3.

2 For example, L cannot be instantiated by the term (λx.y).

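$$
\begin{array}{lrcl}
\text{Refined Values} & U' & ::= & U \mid \bullet^{\vec{V}} \mid \mathsf{case}[\overrightarrow{V \mapsto L}] \\
\text{Heaps} & \Sigma & ::= & \overrightarrow{(L, U')}
\end{array}
$$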


3.2 Semantics of Symbolic $\lambda_C$

We now turn to the reduction semantics for Symbolic $\lambda_C$, which combines standard rules
for untyped languages with behavior for unknown values. Reduction is defined as a relation
on states, parameterized by a module context. We omit the module context whenever it is
irrelevant.

$$\vec{M} \vdash \varsigma \longmapsto \varsigma'$$

A state is an expression paired with a heap:

$$\text{States} \quad \varsigma ::= (E, \Sigma)$$


3.2.1 Basic rules 

The first reduction rule concerns the application of primitive operations, which are interpreted
by a $\delta$ relation. The relation maps operations, arguments and heaps to results and
new heaps.

Apply-Primitive

$$\frac{\delta(\Sigma, O, \vec{V}) \ni \varsigma}{(O\ \vec{V}),\ \Sigma \longmapsto \varsigma}$$

The use of a $\delta$ relation in reduction semantics is standard, but typically it is a function and
is independent of the heap. We make $\delta$ dependent on the heap in order to use and update the
current set of invariants; we make it a relation, since it may behave non-deterministically on
unknown values. For example, in interpreting (> L 5) where $L \mapsto \bullet^{\mathsf{int?}}$, the $\delta$ relation
will produce two results: one is 1, with an updated heap reflecting that the unknown value is
(>/c 5); the other is 0, with a heap reflecting the opposite. The $\delta$ relation is thus the hub of the
verification system and a point of interaction with the SMT solver. It is described in more
detail in section 3.3.
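To make the branching behavior of (> L 5) concrete, here is a small Python sketch of a heap-dependent, relational primitive. It is only an illustration under simplifying assumptions: heaps are dictionaries, an unknown value is a frozenset of refinement tags, and the hypothetical function `delta_gt` recognizes only refinements with exactly the same bound rather than consulting a solver.

```python
def delta_gt(heap, addr, n):
    """Sketch of interpreting (> L n): return every (answer, heap') pair
    consistent with what the heap already knows about `addr`."""
    value = heap[addr]
    if isinstance(value, int):  # concrete value: one definite answer
        return [(1 if value > n else 0, heap)]
    results = []
    if ("<=", n) not in value:  # the "true" branch is still possible
        results.append((1, {**heap, addr: value | {(">", n)}}))
    if (">", n) not in value:  # the "false" branch is still possible
        results.append((0, {**heap, addr: value | {("<=", n)}}))
    return results
```

Starting from `{"L": frozenset({"int?"})}`, the call `delta_gt(heap, "L", 5)` returns two branches, each with its own refined copy of the heap; repeating the test in the first branch then yields a single definite answer.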

Applications of $\lambda$-abstractions follow standard $\beta$-reduction; applications of non-functions
result in blame.

Apply-Function

$$(\lambda X.E\ V),\ \Sigma \longmapsto [V/X]E,\ \Sigma$$

Apply-Non-Function

$$\frac{\delta(\Sigma, \mathsf{proc?}, V) \ni (0, \Sigma')}{(V\ V')^{\ell},\ \Sigma \longmapsto \mathsf{blame}^{\ell}_{\Lambda},\ \Sigma'}$$

Notice that in rule Apply-Non-Function, the $\delta$ relation is employed to determine whether the
value in operator position is a function using the proc? primitive. (Non-functions include
concrete numbers as well as abstract values known to exclude functions; application of
abstract values that may be functions is described in section 3.2.3.)

Conditionals treat values other than 0 as true. 

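If-True

$$\frac{\delta(\Sigma, \mathsf{zero?}, V) \ni (0, \Sigma')}{\mathsf{if}\ V\ E_1\ E_2,\ \Sigma \longmapsto E_1,\ \Sigma'}$$

If-False

$$\frac{\delta(\Sigma, \mathsf{zero?}, V) \ni (1, \Sigma')}{\mathsf{if}\ V\ E_1\ E_2,\ \Sigma \longmapsto E_2,\ \Sigma'}$$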

Just as in the case of Apply-Non-Function, the interpretation of conditionals uses the $\delta$
relation to determine whether zero? holds. The relation takes into account all of the knowledge
accumulated in $\Sigma$ and, in whichever branch is taken, updates the current knowledge to reflect
whether zero? of V holds. This is the mechanism by which control-flow based refinements
are enabled.

The two rules for module references reflect the approach in which contracts are treated 
as boundaries between components (Dimoulas et al. 2011): a module self-reference incurs 
no contract check, while cross-module references are protected by the specified contract. 

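Module-Self-Reference

$$\frac{(\mathsf{module}\ \ell\ V_c\ V) \in \vec{M}}{\vec{M} \vdash \ell^{\ell},\ \Sigma \longmapsto V,\ \Sigma}$$

Module-External-Reference

$$\frac{(\mathsf{module}\ \ell\ V_c\ V) \in \vec{M} \qquad \ell \neq \ell'}{\vec{M} \vdash \ell^{\ell'},\ \Sigma \longmapsto \mathsf{mon}^{\ell,\ell'}_{\ell}(V_c, V),\ \Sigma}$$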

Finally, any state that is stuck with blame inside an evaluation context transitions to a 
final blame state that discards the surrounding context. 

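Halt-Blame

$$\mathcal{E}[\mathsf{blame}],\ \Sigma \longmapsto \mathsf{blame},\ \Sigma$$

Evaluation contexts are defined as follows:

$$\mathcal{E} ::= [\,] \mid \mathcal{E}\ E \mid V\ \mathcal{E} \mid O\ \vec{V}\ \mathcal{E}\ \vec{E} \mid \mathsf{if}\ \mathcal{E}\ E\ E \mid \mathsf{mon}(\mathcal{E}, E) \mid \mathsf{mon}(V, \mathcal{E}) \mid \mathcal{E} \to \lambda X.E$$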


3.2.2 Contract monitoring 

Contract monitoring follows existing operational semantics for contracts (Findler and
Felleisen 2002), with extensions to handle and refine symbolic values.

There are several cases for checking a value against a contract. If the contract is not a
function contract, we say it is flat, denoting a first-order property to be checked immediately.
We thus expand the checking expression to a conditional.

Monitor-Flat-Contract

$$\frac{\delta(\Sigma, \mathsf{dep?}, V_c) \ni (0, \Sigma') \qquad \Sigma' \vdash V : V_c\ ?}{\mathsf{mon}^{\ell,\ell'}_{\ell''}(V_c, V),\ \Sigma \longmapsto \mathsf{if}\ (V_c\ V)\ \mathsf{assume}(V, V_c)\ \mathsf{blame}^{\ell}_{\ell''},\ \Sigma'}$$

Since contracts are first-class, they can also be abstract values; we rely on $\delta$ to determine
whether a value is a flat contract by using (the negation of) the predicate for dependent
contracts, dep?, instead of examining the syntax. This rule is standard except for the use of
$\mathsf{assume}(\cdot, \cdot)$ and the $(\cdot \vdash \cdot : \cdot\ ?)$ judgment. The $\mathsf{assume}(V, V_c)$ form, which would normally
just be V, dynamically refines address V in the heap to indicate that V satisfies $V_c$; assume
is discussed further in section 3.2.3. The judgment $\Sigma \vdash L : V\ ?$, which would normally just
be omitted, indicates that the contract V cannot be statically judged to either pass or fail
for L, which is why the predicate must be applied. This judgment and its closely related
counterparts $(\cdot \vdash \cdot : \cdot\ \checkmark)$ and $(\cdot \vdash \cdot : \cdot\ \text{✗})$, which statically prove that a value must or must not
satisfy a given contract respectively, are discussed in section 3.4.

If a flat contract can be statically proved or refuted, monitoring can be short-circuited. 

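Monitor-Proved

$$\frac{\delta(\Sigma, \mathsf{dep?}, V_c) \ni (0, \Sigma') \qquad \Sigma' \vdash V : V_c\ \checkmark}{\mathsf{mon}(V_c, V),\ \Sigma \longmapsto V,\ \Sigma'}$$

Monitor-Refuted

$$\frac{\delta(\Sigma, \mathsf{dep?}, V_c) \ni (0, \Sigma') \qquad \Sigma' \vdash V : V_c\ \text{✗}}{\mathsf{mon}(V_c, V),\ \Sigma \longmapsto \mathsf{blame},\ \Sigma'}$$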

Monitoring a function contract against a function is interpreted as the standard $\eta$-expansion
of contracts, where we swap the blame roles of positive and negative parties (Findler and
Felleisen 2002). Similar to other values, function contracts can be either concrete or symbolic.
As we later show in the definitions of $\delta$ and the helper metafunction refine, when a
symbolic value is assumed to be a dependent contract, we decompose it into two other symbolic
values identifying its domain and range.

Monitor-Function-Contract

$$\frac{\delta(\Sigma, \mathsf{proc?}, V) \ni (1, \Sigma')}{\mathsf{mon}^{\ell,\ell'}_{\ell''}(V_c \to \lambda X.C, V),\ \Sigma \longmapsto \lambda X.\mathsf{mon}^{\ell,\ell'}_{\ell''}(C, (V\ \mathsf{mon}^{\ell',\ell}_{\ell''}(V_c, X))),\ \Sigma'}$$

Monitor-Abstract-Function-Contract

$$\frac{\delta(\Sigma, \mathsf{proc?}, V) \ni (1, \Sigma') \qquad \delta(\Sigma', \mathsf{dep?}, L) \ni (1, \Sigma'') \qquad \Sigma''(L) = V_c \to \lambda X.C}{\mathsf{mon}^{\ell,\ell'}_{\ell''}(L, V),\ \Sigma \longmapsto \lambda X.\mathsf{mon}^{\ell,\ell'}_{\ell''}(C, (V\ \mathsf{mon}^{\ell',\ell}_{\ell''}(V_c, X))),\ \Sigma''}$$

Finally, monitoring a function contract against a non-function results in an error blaming 
the party providing the value. 

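Monitor-Non-Function

$$\frac{\delta(\Sigma, \mathsf{dep?}, V_c) \ni (1, \Sigma_1) \qquad \delta(\Sigma_1, \mathsf{proc?}, V) \ni (0, \Sigma_2)}{\mathsf{mon}^{\ell,\ell'}_{\ell''}(V_c, V),\ \Sigma \longmapsto \mathsf{blame}^{\ell}_{\ell''},\ \Sigma_2}$$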
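The monitoring rules can be mirrored by a concrete (non-symbolic) Python sketch. It omits heaps, the proof system, and assume, and only shows how flat contracts become immediate checks while function contracts become wrappers with swapped blame roles. The names `Blame`, `Arrow`, and `mon` are hypothetical, and flat contracts are assumed to be ordinary Python predicates.

```python
class Blame(Exception):
    """Error recording the blamed party and the module the contract came from."""

    def __init__(self, party, source):
        super().__init__(f"{party} violated the contract from {source}")
        self.party, self.source = party, source


class Arrow:
    """Dependent function contract: `dom` checks the argument and
    `rng(arg)` produces the contract for the result."""

    def __init__(self, dom, rng):
        self.dom, self.rng = dom, rng


def mon(contract, value, pos, neg, src):
    if isinstance(contract, Arrow):
        if not callable(value):  # function contract against a non-function
            raise Blame(pos, src)

        def wrapper(x):
            # the argument is checked with positive and negative parties swapped
            arg = mon(contract.dom, x, neg, pos, src)
            return mon(contract.rng(x), value(arg), pos, neg, src)

        return wrapper
    if contract(value):  # flat contract: check immediately
        return value
    raise Blame(pos, src)
```

With `Arrow(is_int, lambda x: (lambda r: r > x))` guarding a function, a bad argument blames the caller (the negative party) and a bad result blames the function's own module (the positive party).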


3.2.3 Handling unknown values 

The assume form updates the heap of refinements to take into account the new information 
using the refine metafunction; see figure 2 for the definition of refine. 

Assume

$$\mathsf{assume}(V, V_c),\ \Sigma \longmapsto V,\ \mathsf{refine}(\Sigma, V, V_c)$$

Refinement is straightforward propagation of known contracts, expanding values known to 
be function contracts (via dep?) into function contract values. 

Finally, we must handle application of unknown values. Notice that in the presence of
higher-order arguments, the obvious solution of using a table to model each unknown
function does not work. First, a higher-order function interacts with its context not only
through its returned result, but also through the values it supplies to its functional arguments.
Using a table would omit the latter means through which the unknown function triggers errors in
its context. Second, there is no obvious choice of equality between higher-order values for
use as table keys. For example, it is not clear whether (λ (x) x) and (λ (y) y) should
be the same key. Third, there is no direct connection between a higher-order keyed table
and a $\lambda$-term: a naive construction does not yield the intended function. The following
$\lambda$-term would only ever execute the else clause regardless of its argument, because a
comparison to a function literal in most languages is guaranteed to return false.


(λ (f)
  (cond [(equal? f (λ (x) x)) ... #|dead code|# ...]
        [(equal? f (λ (y) y)) ... #|dead code|# ...]
        [else ...]))


Therefore, instead of viewing a higher-order function as a mapping, we consider different
ways in which it interacts with the known components of the program. Even though there
are numerous ways to instantiate a $\lambda$-term, a function only interacts with its context in a
few specific ways. For example, it is not necessary to consider unknown components that
perform unnecessary computations, have their own errors, or diverge: for each function
with such behavior, there exists another terminating, error-free function that explores no
fewer contract violations in concrete modules.

We therefore refine each unknown function to have a specific shape shown in rule Apply-Unknown.
The unknown function dynamically inspects its argument's datatype to perform
an appropriate operation. If the argument is a first-order value, we model the function as a
table using the $\mathsf{case}[\overrightarrow{V \mapsto L}]$ form discussed next. If the argument is a function, the unknown
function can interact with it in several ways: (1) apply the function to an unknown value
then pass the result to another unknown "continuation", (2) delay the exploration of the
function's behavior and return a value depending on this function, (3) ignore the function
and return a value independent of it. We use addresses $L_f, L_g, L_x, L_a$ for new symbolic values
that this unknown function decomposes to, and symbolic values $L_1, L_2$ to encode the
non-deterministic (but remembered) choices.

Apply-Unknown

$$
\frac{\begin{array}{c}
\delta(\Sigma, \mathsf{proc?}, L) \ni (1, \Sigma_1) \qquad \Sigma_1(L) = \bullet \\
\Sigma_2 = \Sigma_1[L_1 \mapsto \bullet,\ L_2 \mapsto \bullet,\ L_f \mapsto \bullet,\ L_g \mapsto \bullet,\ L_x \mapsto \bullet,\ L_a \mapsto \bullet,\ L' \mapsto \mathsf{case}[\,],\ L \mapsto \lambda X.E] \\
\text{where } E = \mathsf{if}\ (\mathsf{proc?}\ X)\ (\mathsf{if}\ L_1\ ((L_f\ (X\ L_x))\ X)\ (\mathsf{if}\ L_2\ \lambda Y.((L_g\ X)\ Y)\ L_a))\ (L'\ X)
\end{array}}{(L\ V),\ \Sigma \longmapsto [V/X]E,\ \Sigma_2}
$$
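The shape E in Apply-Unknown can be read as ordinary code once the fresh addresses are fixed to concrete choices. The Python sketch below is a hedged illustration of one such instantiation, not the symbolic rule itself; the builder `unknown_function` and all of its parameter names are hypothetical stand-ins for $L_1$, $L_2$, $L_f$, $L_g$, $L_x$, $L_a$, and $L'$.

```python
def unknown_function(choice1, choice2, f, g, x_val, a_val, table):
    """Build one concrete instance of the canonical unknown-function shape."""

    def body(x):
        if callable(x):
            if choice1:
                # (1) probe the argument, then hand the result to a continuation
                return f(x(x_val))(x)
            if choice2:
                # (2) delay exploration: return a value that depends on x
                return lambda y: g(x)(y)
            # (3) ignore the argument entirely
            return a_val
        # first-order argument: behave like a finite table
        return table(x)

    return body
```

Choosing `x_val=0` and passing in a known function such as `lambda n: 10 // n` shows how option (1) lets an unknown context trigger an error inside a known component.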


Finally, finite maps of the form $\mathsf{case}[\overrightarrow{V \mapsto L}]$ on first-order values are used internally
by the execution. Application rules are straightforward as shown in rules Apply-Case-1
and Apply-Case-2: if the application has been seen before, we reuse the result address,
otherwise we return a fresh address and remember the new result in the table.

$$
\begin{array}{rcll}
\mathsf{refine}(\Sigma, L, \mathsf{dep?}) &=& \Sigma[L_x \mapsto \bullet,\ L_c \mapsto \bullet,\ L \mapsto (L_x \to \lambda X.(L_c\ X))] & \text{where } \Sigma(L) = \bullet \text{ and } L_x, L_c \notin \mathrm{dom}(\Sigma) \\
\mathsf{refine}(\Sigma, L, V_c) &=& \Sigma[L \mapsto \bullet^{\vec{V} \cup \{V_c\}}] & \text{where } \Sigma(L) = \bullet^{\vec{V}} \\
\mathsf{refine}(\Sigma, V, V_c) &=& \Sigma & \text{otherwise}
\end{array}
$$

Fig. 2. Refinement for Symbolic $\lambda_C$


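Apply-Case-1

$$\frac{\Sigma(L) = \mathsf{case}[\ldots, V' \mapsto L', \ldots] \qquad \delta(\Sigma, =, V, V') \ni (1, \Sigma')}{(L\ V),\ \Sigma \longmapsto L',\ \Sigma'}$$

Apply-Case-2

$$\frac{\Sigma(L) = \mathsf{case}[V' \mapsto L' \ldots] \qquad \delta(\Sigma, =, V_n, V'_i) \ni (0, \Sigma') \text{ for all } V'_i \in \{V' \ldots\} \qquad L_n \notin \mathrm{dom}(\Sigma)}{(L\ V_n),\ \Sigma \longmapsto L_n,\ \Sigma[L_n \mapsto \bullet,\ L \mapsto \mathsf{case}[V' \mapsto L' \ldots, V_n \mapsto L_n]]}$$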
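A simplified view of the two case rules is a memo table that hands out fresh result addresses. The Python sketch below assumes concrete first-order arguments, so equality is decided directly instead of through $\delta$ (which may branch on symbolic arguments); the class `UnknownFirstOrder` and the address naming scheme are hypothetical.

```python
class UnknownFirstOrder:
    """Table model of an unknown function on first-order arguments."""

    def __init__(self):
        self.table = []   # (argument, result address) pairs, in order of discovery
        self.counter = 0  # used to generate fresh addresses

    def apply(self, arg):
        for seen, addr in self.table:
            if seen == arg:           # seen before: reuse the result address
                return addr
        addr = f"L{self.counter}"     # otherwise allocate a fresh address
        self.counter += 1
        self.table.append((arg, addr))
        return addr
```

Applying the same argument twice returns the same address, which is what makes the unknown function behave like a function rather than an arbitrary relation.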


3.3 Primitive operations 

Primitive operations are the primary place where unknown values in the heap are refined,
in concert with successful contract checks. Figure 3 shows $\delta$'s definition.

The first four rules cover primitive predicate checks. Ambiguity never occurs for concrete
values, and an abstract value may definitely prove or refute the predicate if the available
information is enough for the conclusion. If the proof system cannot decide a definite
result for the predicate check, $\delta$ conservatively includes both answers in the possible
results and records assumptions chosen for each non-deterministic branch in the appropriate
heap. Rules for partial functions such as addition and integer equality, which fail when 
given non-numeric inputs, reveal possible refinements when applying. This mechanism, 
when combined with the SMT-aided proof system given below, is sufficient to provide the 
precision necessary to prove the absence of contract errors. 


3.4 SMT-aided proof system 

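Contract checking and primitive operations rely on a proof system to statically relate values
and contracts. We write $\Sigma \vdash V : V_c\ \checkmark$ to mean value V satisfies contract $V_c$, where all
addresses in V are defined in $\Sigma$. In other words, under any possible instantiation of the
unknown values in $\Sigma$, it would satisfy $V_c$ when checked according to the semantics. On
the other hand, $\Sigma \vdash V : V_c\ \text{✗}$ indicates that V definitely fails $V_c$. Finally, $\Sigma \vdash V : V_c\ ?$ is a
conservative answer when information from the heap and refinement set is insufficient to
draw a definite conclusion. The precision of our analysis depends on the precision of this
provability relation: increasing the number of contracts that can be related statically to
values prunes spurious paths and eliminates impossible error cases.

Pred-True

$$\frac{\Sigma \vdash V : O?\ \checkmark}{\delta(\Sigma, O?, V) \ni (1, \Sigma)}$$

Pred-False

$$\frac{\Sigma \vdash V : O?\ \text{✗}}{\delta(\Sigma, O?, V) \ni (0, \Sigma)}$$

Pred-Ambig-True

$$\frac{\Sigma \vdash V : O?\ ?}{\delta(\Sigma, O?, V) \ni (1, \mathsf{refine}(\Sigma, V, O?))}$$

Pred-Ambig-False

$$\frac{\Sigma \vdash V : O?\ ?}{\delta(\Sigma, O?, V) \ni (0, \mathsf{refine}(\Sigma, V, \lambda X.(\mathsf{zero?}\ (O?\ X))))}$$

Plus-Error-1

$$\frac{\delta(\Sigma, \mathsf{int?}, V_1) \ni (0, \Sigma_1)}{\delta(\Sigma, +, V_1, V_2) \ni (\mathsf{blame}_{\Lambda}, \Sigma_1)}$$

Plus-Error-2

$$\frac{\delta(\Sigma, \mathsf{int?}, V_1) \ni (1, \Sigma_1) \qquad \delta(\Sigma_1, \mathsf{int?}, V_2) \ni (0, \Sigma_2)}{\delta(\Sigma, +, V_1, V_2) \ni (\mathsf{blame}_{\Lambda}, \Sigma_2)}$$

Plus-Concrete

$$\delta(\Sigma, +, n_1, n_2) \ni (n_1 + n_2, \Sigma)$$

Plus-Abstract

$$\frac{\delta(\Sigma, \mathsf{int?}, V_1) \ni (1, \Sigma_1) \qquad \delta(\Sigma_1, \mathsf{int?}, V_2) \ni (1, \Sigma_2) \qquad V_1 \neq n \text{ or } V_2 \neq n}{\delta(\Sigma, +, V_1, V_2) \ni (\bullet^{\{\mathsf{int?},\ (\texttt{=/c}\ (+\ V_1\ V_2))\}}, \Sigma_2)}$$

Eq-Error-1

$$\frac{\delta(\Sigma, \mathsf{int?}, V_1) \ni (0, \Sigma_1)}{\delta(\Sigma, =, V_1, V_2) \ni (\mathsf{blame}_{\Lambda}, \Sigma_1)}$$

Eq-Error-2

$$\frac{\delta(\Sigma, \mathsf{int?}, V_1) \ni (1, \Sigma_1) \qquad \delta(\Sigma_1, \mathsf{int?}, V_2) \ni (0, \Sigma_2)}{\delta(\Sigma, =, V_1, V_2) \ni (\mathsf{blame}_{\Lambda}, \Sigma_2)}$$

Eq-True

$$\frac{\delta(\Sigma, \mathsf{int?}, V_1) \ni (1, \Sigma_1) \qquad \delta(\Sigma_1, \mathsf{int?}, V_2) \ni (1, \Sigma_2) \qquad \Sigma_2 \vdash V_1 : (\texttt{=/c}\ V_2)\ \checkmark}{\delta(\Sigma, =, V_1, V_2) \ni (1, \Sigma_2)}$$

Eq-False

$$\frac{\delta(\Sigma, \mathsf{int?}, V_1) \ni (1, \Sigma_1) \qquad \delta(\Sigma_1, \mathsf{int?}, V_2) \ni (1, \Sigma_2) \qquad \Sigma_2 \vdash V_1 : (\texttt{=/c}\ V_2)\ \text{✗}}{\delta(\Sigma, =, V_1, V_2) \ni (0, \Sigma_2)}$$

Eq-Ambig-True

$$\frac{\delta(\Sigma, \mathsf{int?}, V_1) \ni (1, \Sigma_1) \qquad \delta(\Sigma_1, \mathsf{int?}, V_2) \ni (1, \Sigma_2) \qquad \Sigma_2 \vdash V_1 : (\texttt{=/c}\ V_2)\ ?}{\delta(\Sigma, =, V_1, V_2) \ni (1, \mathsf{refine}(\Sigma_2, V_1, (\texttt{=/c}\ V_2)))}$$

Eq-Ambig-False

$$\frac{\delta(\Sigma, \mathsf{int?}, V_1) \ni (1, \Sigma_1) \qquad \delta(\Sigma_1, \mathsf{int?}, V_2) \ni (1, \Sigma_2) \qquad \Sigma_2 \vdash V_1 : (\texttt{=/c}\ V_2)\ ?}{\delta(\Sigma, =, V_1, V_2) \ni (0, \mathsf{refine}(\Sigma_2, V_1, (\neq\!\texttt{/c}\ V_2)))}$$

Fig. 3. Basic Operations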


3.4.1 Basic proof system

A simple proof system (figure 4) can be obtained which returns definite answers for concrete
values, uses heap refinements, and handles negation of predicates and disjointness of
data types. We abbreviate $\lambda X.(O?\ X)$ as $O?$.

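Notice that the proof system only needs to handle a small number of well-understood
contracts. We rely on evaluation to naturally break down complex contracts into smaller
ones and take care of subtle issues such as divergence and crashing. By the time we have
$\Sigma(L) = \bullet^{\vec{V}}$, we can assume all contracts in $\vec{V}$ have terminated with success on L. With
these simple and obvious rules, our system can already verify a significant number of
interesting programs. With SMT solver integration, as described below, we can handle far
more interesting constraints, including relations between numeric values, without requiring
an encoding of the full language.

$$
\frac{\Sigma \vdash \Sigma(L) : V_c\ \checkmark}{\Sigma \vdash L : V_c\ \checkmark}
\qquad
\frac{\Sigma \vdash \Sigma(L) : V_c\ \text{✗}}{\Sigma \vdash L : V_c\ \text{✗}}
\qquad
\Sigma \vdash n : \mathsf{int?}\ \checkmark
\qquad
\Sigma \vdash \lambda X.E : \mathsf{proc?}\ \checkmark
$$

$$
\Sigma \vdash V \to \lambda X.C : \mathsf{dep?}\ \checkmark
\qquad
\frac{\Sigma \vdash V : \mathsf{int?}\ \checkmark \qquad \Sigma \vdash V : (\texttt{=/c}\ 0)\ \checkmark}{\Sigma \vdash V : \mathsf{zero?}\ \checkmark}
$$

$$
\frac{\Sigma \vdash V : \mathsf{int?}\ \checkmark \qquad \Sigma \vdash V : (\texttt{=/c}\ 0)\ \text{✗}}{\Sigma \vdash V : \mathsf{zero?}\ \text{✗}}
\qquad
\frac{\Sigma \vdash V : \mathsf{int?}\ \text{✗}}{\Sigma \vdash V : \mathsf{zero?}\ \text{✗}}
$$

$$
\frac{\Sigma \vdash V : O?\ \checkmark \qquad O? \neq O?' \qquad O?, O?' \in \{\mathsf{int?}, \mathsf{proc?}, \mathsf{dep?}\}}{\Sigma \vdash V : O?'\ \text{✗}}
\qquad
\Sigma \vdash \bullet^{\{\ldots, V_c, \ldots\}} : V_c\ \checkmark
$$

$$
\frac{\Sigma \vdash V : \lambda X.E\ \checkmark}{\Sigma \vdash V : \lambda X.(\mathsf{zero?}\ E)\ \text{✗}}
\qquad
\frac{\Sigma \vdash V : \lambda X.E\ \text{✗}}{\Sigma \vdash V : \lambda X.(\mathsf{zero?}\ E)\ \checkmark}
\qquad
\frac{\Sigma(V) = \bullet^{\{\ldots, \lambda X.(\mathsf{zero?}\ E), \ldots\}}}{\Sigma \vdash V : \lambda X.E\ \text{✗}}
$$

$$
\frac{\Sigma \vdash V : V_c\ \checkmark \text{ is not derivable} \qquad \Sigma \vdash V : V_c\ \text{✗} \text{ is not derivable}}{\Sigma \vdash V : V_c\ ?}
$$

Fig. 4. Basic Proof System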


3.4.2 Integrating an SMT solver 

We extend the simple provability relation by employing an external solver. 

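We first define the translation $\{\!\{\cdot\}\!\}$ from heaps, address-value pairs, and address-contract
pairs into formulas in solver S:

$$
\begin{array}{rcl}
\{\!\{\Sigma\}\!\} &=& \bigwedge_{(L \mapsto V) \in \Sigma} \{\!\{L \mapsto V\}\!\} \\
\{\!\{L \mapsto n\}\!\} &=& L = n \\
\{\!\{L_0 : (\texttt{>/c}\ V_1)\}\!\} &=& L_0 > V_1 \\
\{\!\{L : (\texttt{=/c}\ (+\ V_0\ V_1))\}\!\} &=& L = V_0 + V_1
\end{array}
$$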


The translation of a heap is the conjunction of all formulas generated from translatable
refinements. The function is partial, and there are straightforward rules for translating specific
pairs of (L : C) where C are drawn from a small set of simple, well-understood contracts. 

This mechanism is enough for the system to verify many interesting programs because 
the analysis relies on evaluation to break down complex, higher-order predicates. Not having
a translation for some contract C only reduces precision and does not affect soundness.

Next, the extension $(\vdash_S)$ is straightforward. The old relation $(\vdash)$ is refined by a solver
S. Whenever the basic relation proves $\Sigma \vdash L : C\ ?$, we call out to the solver to try to either
prove or refute the claim:

$$
\frac{\{\!\{\Sigma\}\!\} \wedge \neg\{\!\{V : V_c\}\!\} \text{ is unsat}}{\Sigma \vdash_S V : V_c\ \checkmark}
\qquad
\frac{\{\!\{\Sigma\}\!\} \wedge \{\!\{V : V_c\}\!\} \text{ is unsat}}{\Sigma \vdash_S V : V_c\ \text{✗}}
$$

The solver-aided relation uses refinements available on the heap to generate premises $\{\!\{\Sigma\}\!\}$.
Unsatisfiability of $\{\!\{\Sigma\}\!\} \wedge \neg\{\!\{V : C\}\!\}$ is equivalent to validity of $\{\!\{\Sigma\}\!\} \Rightarrow \{\!\{V : C\}\!\}$, hence
value V definitely satisfies contract C. Likewise, unsatisfiability of $\{\!\{\Sigma\}\!\} \wedge \{\!\{V : C\}\!\}$ means V
definitely refutes C. In any other case, we relate the value-contract pair to the conservative
answer.
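The three-way outcome of the solver-aided relation can be sketched without a real SMT solver. The Python code below is a deliberately naive stand-in, under the loud assumption that variables range over a small bounded set of integers: exhaustive enumeration replaces the unsatisfiability check, so it illustrates the proved/refuted/unknown logic only and is not a sound prover. The functions `satisfiable` and `check` are hypothetical.

```python
from itertools import product


def satisfiable(variables, constraints, domain=range(-8, 9)):
    """Brute-force search for an assignment satisfying every constraint.
    This bounded enumeration merely stands in for an SMT solver call."""
    for values in product(domain, repeat=len(variables)):
        env = dict(zip(variables, values))
        if all(c(env) for c in constraints):
            return True
    return False


def check(variables, heap_facts, goal):
    """Return 'proved', 'refuted', or 'unknown' for `goal` under `heap_facts`."""
    if not satisfiable(variables, heap_facts + [lambda env: not goal(env)]):
        return "proved"    # facts and the negated goal are unsatisfiable
    if not satisfiable(variables, heap_facts + [goal]):
        return "refuted"   # facts and the goal are unsatisfiable
    return "unknown"       # the conservative answer
```

For example, with facts L0 > 0, L1 > 0, and L = L0 + L1, the goal L > 0 is proved, L < 0 is refuted, and L > 5 stays unknown.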


3.5 Program evaluation 

We give a reachable-states semantics to programs: the initial program P is paired with 
an initial heap that maps each address in the program to a fully opaque value, and eval 
produces all states in the reflexive, transitive closure of the single-step reduction relation 
closed under evaluation contexts. 


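$$
\begin{array}{l}
\mathit{eval} : P \to \mathcal{P}(\varsigma) \\
\mathit{eval}(\vec{M}\ E) = \{\varsigma \mid \vec{M} \vdash (E'; E),\ \Sigma_0 \longmapsto^{*} \varsigma\} \\
\quad \text{where } E' = \mathit{amb}(\{1, (L_f\ \ell)\}),\ (\mathsf{module}\ \ell\ V_c\ V) \in \vec{M} \\
\quad \text{and } \Sigma_0 = \{L \mapsto \bullet \mid L \text{ appears in } P\} \cup \{L_f \mapsto \bullet\} \\
\quad \text{and } \mathit{amb}\{E\} = E,\quad \mathit{amb}\{E_1, E \ldots\} = \mathsf{if}\ L_i\ E_1\ (\mathit{amb}\{E \ldots\}) \text{ for each fresh address } L_i
\end{array}
$$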

Modules with unknown definitions, which we call opaque, complicate the definition of 
eval, since they may contain references to concrete modules. If only the main module 
is considered, an opaque module might misuse a concrete value in ways not visible to the 
system. We therefore apply an unknown function to each concrete module before evaluating 
the main expression. 


3.6 Soundness of abstract semantics 

A program with unknown components is an abstraction of a fully-known program. Thus, 
the semantics of the abstracted program should approximate the semantics of any such 
concrete version. In particular, any behavior the concrete program exhibits should also be 
exhibited by the abstract approximation of that program. 

However, we must be precise as to which behaviors are relevant. Suppose we have a 
single concrete module that links against a single opaque module. The semantics of this 
program should include all of the possible behaviors, both good and bad, of the known 
module assuming the opaque module always lives up to its contract. We exclude from 
consideration behaviors that cause the unknown module to be blamed, since it is of course 
impossible to verify an unknown program. In other words, we try to verify the parts of
the program that are known, assuming arbitrary, but correct, behavior for the parts of the
program that are unknown.³

For this reason, the precise semantic account of blame is crucial. The demonic context 
can introduce blame of both the known and unknown modules; since we can distinguish 
these parties, it is easy to ignore blame of the unknown context. 

In the remainder of this section, we formally define the approximation relation and show 
that evaluation preserves this relation, i.e. if program q is an approximation of program
p (p is like q but with no unknowns), then the evaluation of q is an approximation of the
evaluation of p.


3.6.1 Approximation 

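We write $\varsigma' \sqsubseteq \varsigma$ to mean "$\varsigma$ approximates $\varsigma'$" or "$\varsigma'$ instantiates $\varsigma$", which intuitively means
$\varsigma$ stands for a set of states including $\varsigma'$. For example, $(1, \emptyset) \sqsubseteq (L, \{L \mapsto \bullet\})$. Because
we restrict the instantiating side to contain no symbolic value, the heap on that side is irrelevant, and we
abbreviate $(E', \emptyset) \sqsubseteq (E, \Sigma)$ as $E' \sqsubseteq (E, \Sigma)$ and $(E'_1, \emptyset) \longmapsto (E'_2, \emptyset)$ as $E'_1 \longmapsto E'_2$.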

Consistent instantiation of symbolic values In order to enforce that each symbolic value
is instantiated by one concrete value, we parameterize the relation with a fully concrete
heap indicating the instantiation of each symbolic value. For example, expression (+ 1 2)
instantiates $((+\ L_1\ L_2), \{L_1 \mapsto \bullet, L_2 \mapsto \bullet\})$, parameterized by $\{L_1 \mapsto 1, L_2 \mapsto 2\}$. A naive
definition of the approximation without this parameter would admit a weaker approximation
relation not preserved by reduction, where different sub-expressions instantiate
symbolic values differently. For example, in the following, suppose we admitted that
$E' \sqsubseteq (E, \Sigma)$ by straightforward structural induction without enforcing consistent instantiation of
labels (because 0, 1, 2 each refines $(L, \{L \mapsto \bullet\})$ individually); we would then need to prove
that their next states preserve the relation.

$$E' = (\mathsf{if}\ 0\ 1\ 2) \qquad (E, \Sigma) = ((\mathsf{if}\ L\ L\ L),\ \{L \mapsto \bullet\})$$

The next abstract state, however, does not continue to approximate the concrete one:

$$E' \longmapsto 2 \qquad (E, \{L \mapsto \bullet\}) \longmapsto (L, \{L \mapsto 0\})$$

With a parameter enforcing consistent instantiation of symbolic values, we prevent this
"accidental" approximation from being established. In the above example, since there is no instantiation
$\Sigma'$ such that $\Sigma'(L) = 0$ and $\Sigma'(L) = 2$, we cannot derive that $E' \sqsubseteq (E, \Sigma)$ in the first place.
Instead, in the following example, $E' \sqsubseteq_{\Sigma'} (E, \Sigma)$, where $\Sigma' = \{L_0 \mapsto 0, L_1 \mapsto 1, L_2 \mapsto 2\}$:

$$E' = (\mathsf{if}\ 0\ 1\ 2) \qquad E = (\mathsf{if}\ L_0\ L_1\ L_2) \qquad \Sigma = \{L_0 \mapsto \bullet, L_1 \mapsto \bullet, L_2 \mapsto \bullet\}$$
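The role of the instantiation parameter can be checked mechanically on the two examples. The following Python sketch is a simplified model, assuming expressions are nested tuples, addresses are strings that appear as keys of the instantiation, and all unknowns are first-order; the function name `instantiates` is hypothetical.

```python
def instantiates(concrete, abstract, inst):
    """Check that `concrete` instantiates `abstract` under one fixed
    instantiation `inst` mapping each address to a concrete value."""
    if isinstance(abstract, str) and abstract in inst:  # an address
        return inst[abstract] == concrete
    if isinstance(abstract, tuple):                     # compound expression
        return (isinstance(concrete, tuple)
                and len(concrete) == len(abstract)
                and all(instantiates(c, a, inst)
                        for c, a in zip(concrete, abstract)))
    return concrete == abstract                         # keywords and literals
```

No single choice for L relates `("if", 0, 1, 2)` to `("if", "L", "L", "L")`, whereas three distinct addresses instantiated by 0, 1, and 2 do.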


3 Equivalently, we can think of the execution as implicitly blaming each unknown component for
each possible error with a trivially constructed counterexample.

Omitting behavior from unknown components Our soundness result does not consider
additional errors that blame unknown modules, and therefore we parameterize the approximation
relation with the module definitions $\vec{M}$ to select the opaque modules. We omit
these parameters where they are easily inferred to ease notation.

Figure 5 shows the definition of $\sqsubseteq$. Each concrete value is approximated by a symbolic
value if the heap gives no restriction on the symbolic value's behavior. Further, if the
concrete value is known to satisfy a contract, adding that contract to the abstract value
preserves the approximation. We write $\Sigma(U)$ to mean a straightforward instantiation of
all symbolic values in U according to heap $\Sigma$. We extend the relation $\sqsubseteq$ structurally to
evaluation contexts $\mathcal{E}$, point-wise to sequences, and to sets of program states.

Instantiating unknown components Finally, we justify our choice of instantiating unknown
functions to only one specific shape, and show that it is sufficient to approximate
all possible interactions between the known and unknown program components.

Lemma 1 (Canonical counterexample)

If V and V' come from different modules, (module H V_c V') ∈ M, and (V V') ⟼* A where
A is a value or blame^H_{H'}, there exists λX.E such that (λX.E V') ⟼* A and λX.E conforms
to V_r in the following grammar:

V_r ::= λX.(if (proc? X) (ifn ((V_r (X V)) X) V) (V_m X))

V_m ::= λX.(if (= X n) V (V_m X)) | λX.V

Proof 

Without loss of generality, assume V is non-recursive, bug-free, and does not introduce
divergence of its own. (If V is recursive, we unroll it as many times as needed to reproduce
the finite trace when applied to V'. Further, V’s own bugs or potential divergence must not
have affected the result of its application to V', so we replace the corresponding source code
with trivial expressions.)

If the function body E can be decomposed into an evaluation context and a redex ℰ[E'],
without loss of generality, we only consider cases where E' contains X and depends on an
actual value of X to reduce. (Otherwise, because E' does not contain divergence or error of
its own, we can safely “partially evaluate” E' to eliminate any redundant redex.)

We therefore translate E for the following cases. The translation {{E}} behaves identically
to E up to the finite set of values V⃗ that the free variable X in E can have. The translation
terminates because each step decreases E’s size.

• Case E = ℰ[if X E_1 E_2]:

{{E}} = (if (proc? X) E_1' (V_m X)), where

– (if (proc? X) E_1' E_2') = {{ℰ[E_1]}}

– V_m is a table approximating E_2 over V⃗ for X. (Because E_2 does not have errors
or divergence, and X is known to be a number, the evaluation of E_2 for these
particular values of X is guaranteed to terminate.)

• Case E = ℰ[X V]:

{{E}} = (if (proc? X) (((λZ.λX.{{ℰ[Z]}}) (X V)) X) (λX.0 X))

• Case E = ℰ[proc? X]:

Case analysis on ℰ:

– Case ℰ = ℰ'[if [ ] E_1 E_2]:

{{E}} = (if (proc? X) E_1' (V_m X))

where (if (proc? X) E_1' E_2') = {{ℰ'[E_1]}} and V_m is an approximation of E_2 over
V⃗ for X.

– Case ℰ = ℰ'[O V ... [ ] V' ...]: Because (proc? X) only evaluates to either 0 or
1, there are 3 cases:

- (O V ... [ ] V' ...) preserves the truth of (proc? X): then

{{E}} = {{ℰ'[proc? X]}}

- (O V ... [ ] V' ...) negates the truth of (proc? X): then

{{E}} = (if (proc? X) E_1' E_2')

where (if (proc? X) E_1' E_1'') = {{ℰ'[0]}}
and (if (proc? X) E_2'' E_2') = {{ℰ'[1]}}

- (O V ... [ ] V' ...) has constant truth regardless of (proc? X): then {{E}}
= {{ℰ'[0]}} or {{ℰ'[1]}} depending on the constant truth.

– Case ℰ = [ ]: {{E}} = (if (proc? X) 1 (λX.0 X))

• Case E = ℰ[int? X]: Similar to the previous case but with the clauses reversed.

• Case E = ℰ[O V' ... X V' ...], where V' ::= V | X:

Because E is bug-free and divergence-free by assumption, and X is first-order, {{E}}
= (if (proc? X) 0 (V_m X)), where V_m is constructed as a table mapping each value
V_i in V⃗ to the evaluation of [V_i/X]E (which terminates).

• Case E = X: {{E}} = (if (proc? X) X X)

• Case E = V: {{E}} = (if (proc? X) V V)

□ 
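
To make the canonical shape more tangible, the following is an illustrative Python sketch (not the paper's formal grammar): an unknown function either probes a functional argument and passes the result on, or looks up a first-order argument in a finite table. The names `table`, `canonical`, `probe`, `on_result`, and `known` are hypothetical and chosen only for this example.

```python
def table(entries, default):
    """Plays the role of V_m: a finite case analysis on a first-order argument."""
    return lambda x: entries.get(x, default)


def canonical(probe, on_result, first_order):
    """Plays the role of V_r: if the argument is a function, apply it to `probe`
    and hand the result to `on_result`; otherwise consult `first_order`."""
    def unknown(x):
        if callable(x):
            return on_result(x(probe))(x)
        return first_order(x)
    return unknown


# A known function with a latent division by zero at n = 5.
known = lambda n: 10 // (n - 5)

# A canonical unknown caller that probes its argument with 5.
counterexample = canonical(probe=5,
                           on_result=lambda z: (lambda x: z),
                           first_order=table({}, 0))
```

Applying `counterexample` to `known` reproduces the error inside `known`, while applying it to a number falls through to the table.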

With the definition of approximation in hand, we now state the main soundness lemma
for the system, which is the basis for relative completeness of counterexamples (section 3.7)
and soundness of contract verification (section 3.8).

Lemma 2 (Soundness of reduction relation)

If E_1' ⊑^{Σ'} ⟨E_1, Σ_1⟩ and E_1' ⟼ E_2', then ⟨E_1, Σ_1⟩ ⟼* ⟨E_2, Σ_2⟩ such that
E_2' ⊑^{Σ''} ⟨E_2, Σ_2⟩ and Σ'' is consistent with Σ', for some E_2, Σ_2, and Σ''.

We defer all proofs to the appendix for space. 

3.7 Soundness and relative completeness of counterexample generation

The semantics of λ_C accumulates a first-order path invariant, as is standard in first-order
symbolic execution. In addition, it refines the shape of unknown higher-order values.
When an evaluation reaches an error state, we query the SMT solver for a model of all
first-order values. Plugging this instantiation of first-order values into the heap directly gives
us an instantiation of all originally omitted values that reproduces the error. An unknown
higher-order value with no constraint on it can be any function, particularly the identity

Fig. 5. Approximation (inference rules Unknown, Unknown-Refined, Lambda-Unknown, Blame-Ignored, If, Lambda, App, Prim, Check, Var, Blame, and Dep)


function that can be simplified away. The remarkable result is that our method of finding 
counterexamples is both sound and relatively complete with respect to the underlying first- 
order SMT solver. 
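
The step from an error state to a concrete counterexample can be sketched as follows. This is an illustration only: a brute-force enumeration over small integers stands in for the SMT query, and `find_model`, `program`, and the label `L0` are hypothetical names introduced for the example.

```python
from itertools import product


def find_model(labels, constraints, candidates=range(-20, 21)):
    """Stand-in for an SMT query: enumerate small integers until every
    constraint of the path condition holds. Returns None if nothing fits."""
    for values in product(candidates, repeat=len(labels)):
        model = dict(zip(labels, values))
        if all(check(model) for check in constraints):
            return model
    return None


def program(n):
    """Concrete program under test, with one unknown first-order input."""
    if n > 0:
        return 1 // (2 * n - 10)
    return 0


# Path condition collected along the branch that reaches the division error.
error_path = [lambda m: m["L0"] > 0,
              lambda m: 2 * m["L0"] - 10 == 0]

model = find_model(["L0"], error_path)
```

Running `program` on the value the model assigns to `L0` then replays the error branch concretely, which is the sense in which a model of the path condition is a counterexample.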


Soundness of counterexamples Because we refine unknown functions to have specific
shapes in addition to maintaining a complete path condition of first-order values, the
semantics of λ_C is a sound under-approximation of all valid program runs. Therefore, any
valid instantiation of the path condition for a specific branch will reproduce the execution
following that branch. In particular, an instantiation in an error branch yields a true
counterexample triggering the contract violation of that branch.

Theorem 1 (Soundness of Counterexamples)

If ⟨E_1, Σ_1⟩ ⟼* ⟨blame^H_{H'}, Σ_n⟩, Σ' ⊑ Σ_n, and E_1' = Σ'(E_1), then E_1' ⟼* blame^H_{H'}.

Relative completeness of counterexamples The abstract reduction semantics of λ_C also
provides a sound over-approximation of all possible interactions between the known and
unknown program components, discovering every reachable error in the concrete modules
(lemma 2). Therefore, as long as the SMT solver can construct a model of the given
first-order formula, we can construct a higher-order function that reproduces each discovered
error, simply by plugging in the first-order values given by the solver.

Theorem 2 (Relative Completeness of Counterexamples)

If E_1' ⟼* blame^H_{H'}, E_1' ⊑ ⟨E_1, Σ_1⟩, and there is a complete procedure for generating values
satisfying first-order constraints, then ⟨E_1, Σ_1⟩ ⟼* ⟨blame^H_{H'}, Σ_n⟩ such that we can derive
some Σ' such that Σ' ⊑ Σ_n.

3.8 From bug-finding to verification 

The semantics of λ_C is helpful not only for generating test cases that reproduce contract
violations, but also for verifying contract correctness. Because the existence of a
counterexample implies the existence of a “canonical” counterexample of the form in rule
Apply-Unknown (lemma 1), proving the absence of counterexamples of this form alone is
equivalent to verification of the program. Unfortunately, a naive run of a program in this
semantics does not terminate for most programs: execution unfolds indefinitely to explore
an infinite set of instantiations of abstract values. We therefore introduce two
transformations that approximate the semantics of λ_C to accelerate convergence, making it a
practical verification method for many programs.


3.8.1 Approximating unknown functions 

Rule Apply-Unknown shown in section 3.2.3 unfolds and remembers the shape of each
unknown function as execution progresses. Although this refinement is useful for
constructing higher-order counterexamples, it is a major source of non-termination: the
execution repeatedly generates fresh λ-terms. As a more approximate execution of opaque
function applications, we no longer unfold an unknown function upon application, and we
replace rule Apply-Unknown with two non-deterministic rules: Apply-Unknown-Success returns
a fresh address approximating an unknown result, and Apply-Unknown-Havoc passes the argument
to a demonic context whose sole purpose is to find reachable blames in the argument: it
repeatedly applies the argument to an unknown value, then places the value back into the
unknown context. (Even though V may not be a function, the semantics of blames allows us
to ignore the potentially erroneous application, which is the responsibility of the unknown
component.)

Apply-Unknown-Success
Σ(L) = •    δ(Σ, proc?, L) ∋ (1, Σ')
-------------------------------------
⟨(L V), Σ⟩ ⟼ ⟨L', Σ'[L' ↦ •]⟩

Apply-Unknown-Havoc
Σ(L) = •    δ(Σ, proc?, L) ∋ (1, Σ')
-------------------------------------
⟨(L V), Σ⟩ ⟼ ⟨(L (V L')), Σ'[L' ↦ •]⟩

This abstraction does not allow easy construction of concrete counterexamples in case of
errors, and may introduce more spurious paths, but does not significantly affect precision
in practice. Below is an example where the abstracted semantics reports a spurious contract
violation: the second error is not reachable. The unknown function f either
applies or ignores its argument, but the abstraction prevents execution from remembering
the choice in a particular branch.

(let ([f L]) ;; where {L ↦ •}
  (f (λ (x) (/ 1 0)))
  (f (λ (x) (/ 1 x))))
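
The loss of precision can be reproduced with a small Python enumeration. It is a deliberately simplified model, assuming the unknown f has only two behaviors (apply its argument to 0, or ignore it); `run` and the choice labels are hypothetical names for this sketch.

```python
def run(choices):
    """Run both calls to f. `choices[i]` says whether f applies its argument
    (to 0) or ignores it at call i. Return the index of the failing call."""
    arguments = [lambda x: 1 // 0, lambda x: 1 // x]
    for index, (choice, argument) in enumerate(zip(choices, arguments), start=1):
        if choice == "apply":
            try:
                argument(0)
            except ZeroDivisionError:
                return index
    return None


behaviors = ("apply", "ignore")

# f remembers its shape: the same choice is made at both calls.
consistent = {run((c, c)) for c in behaviors}

# Abstracted semantics: each call picks a behavior independently.
abstracted = {run((c1, c2)) for c1 in behaviors for c2 in behaviors}
```

With a consistent choice only the first call can fail; once the choice is forgotten between calls, a failure in the second call also appears, which corresponds to the spurious violation described above.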
Lemma 3 (Soundness of unknown function approximation)

If (V_1 V_2) ⊑ ⟨(L V), Σ⟩ and (V_1 V_2) ⟼ E', then ⟨(L V), Σ⟩ ⟼* ⟨E, Σ'⟩ such that
E' ⊑ ⟨E, Σ'⟩.


3.8.2 Summarizing function results 

With the abstraction as presented in section 3.8.1, the semantics still does not terminate for 
many common recursive programs. Consider the following example: 

(define (fact n) 

(if (= n 0) 1 (* n (fact (- n 1))))) 

(fact L_n)

Ignoring error cases, it eventually reduces non-deterministically to all of the following:

1                                      if L_n ↦ 0
(* L_n 1)                              if L_n ≠ 0, L_{n-1} ↦ 0
(* L_n (* L_{n-1} (fact L_{n-2})))     if L_n, L_{n-1} ≠ 0

where L_{n-1} is a fresh address resulting from subtracting one from L_n. The process continues
with L_{n-2}, L_{n-3}, etc. The analysis behaves this way because it attempts to
approximate all possible concrete substitutions for abstract values. Although fact terminates
for all concrete naturals, there are infinitely many of them: L_n can be 0, 1, 2, and so on.
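
The unbounded unfolding can be made explicit by listing the terminating paths one at a time. The following Python sketch is illustrative only; `fact_paths` and the textual label names are hypothetical, and path conditions are kept as plain strings.

```python
def fact_paths(depth):
    """List the first `depth` terminating symbolic paths of (fact L_n) as
    (result expression, path condition) pairs."""
    paths = []
    prefix, closing, conditions = "", "", []
    for k in range(depth):
        label = "L_n" if k == 0 else f"L_n-{k}"
        # Path on which the k-th recursive argument is zero: recursion stops.
        paths.append((f"{prefix}1{closing}", conditions + [f"{label} = 0"]))
        # Otherwise the multiplication is unfolded once more.
        prefix += f"(* {label} "
        closing += ")"
        conditions = conditions + [f"{label} != 0"]
    return paths
```

Every increase of `depth` yields one more distinct path, so no finite bound covers all of them; this is the behavior that the summarization described next is meant to avoid.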

To enforce termination for all programs, we could resort to well-known techniques such
as finite-state or pushdown abstractions (Van Horn and Might 2012). But those are often
overkill and cost precision. Consider the following program:

(let* ([id (λ (x) x)]
       [y (id 0)]
       [z (id 1)])
  (< y z))

where a monovariant flow analysis such as 0CFA (Shivers 1988) thinks y and z can each be
either 0 or 1, and pushdown analysis thinks y is 0 and z is either 0 or 1. For a concrete,
straight-line program, such imprecision seems unsatisfactory. We therefore aim for an analysis
that provides exact execution for non-recursive programs and retains enough invariants to verify
interesting properties of recursive ones. The analysis quickly terminates for a majority of
programming patterns with decent precision, although it is not guaranteed to terminate in
the general case (see section 4 for empirical results).
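
The difference between merging all calls to id and executing them exactly can be sketched in a few lines of Python. This is a toy model rather than an implementation of 0CFA; `calls`, `monovariant`, `exact`, and `less_than` are hypothetical names.

```python
# (id 0) is bound to y and (id 1) is bound to z.
calls = [("y", 0), ("z", 1)]


def monovariant(calls):
    """One abstract binding for x is shared by every call to id."""
    x = {argument for _, argument in calls}
    return {name: set(x) for name, _ in calls}


def exact(calls):
    """Each call is executed on its own argument."""
    return {name: {argument} for name, argument in calls}


def less_than(env):
    """All possible outcomes of (< y z) under the given value sets."""
    return {a < b for a in env["y"] for b in env["z"]}
```

Under the merged binding the comparison may evaluate either way, whereas exact execution yields a single outcome.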

One technical difficulty is that the semantics of contracts prevents us from using a
recursive function’s contract directly as a loop invariant, because contracts are enforced
only at boundaries. It is unsound to assume returned values of internal calls can be
approximated by contracts, as in f below.

(f : (and/c int? (>/c 0)) -> (and/c int? (>/c 0))) 

(define (f n) 

(if (= n 0) "" (string-length (f (- n 1))))) 
If we assume the expression (f (- n 1)) returns a number as specified in the contract,
we will conclude f never returns, and is blamed either for violating its own contract by
returning a string, or for applying string-length to a number. However, f returns 0 when
applied to 1. To soundly and precisely approximate this semantics in the absence of types,
we recover data type invariants by execution.
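
The behavior of f can be checked with a direct Python transcription, in which `len` plays the role of string-length. The transcription is only an illustration; contract monitoring itself is not modeled.

```python
def f(n):
    """Transcription of f: the base case returns a string, the recursive
    case takes the length of the internal call's result."""
    return "" if n == 0 else len(f(n - 1))
```

The internal call (f 0) returns the empty string, so (f 1) returns its length, 0. Assuming instead that the internal call returns a number would miss this successful run.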

We modify the application rules as follows. At each application, we decide whether 
execution should step to the function’s body or wait for known results from other branches. 
When an application (f v) reduces to a similar application, we plug in known results 
instead of executing f’s body again, avoiding the infinite loop. Correspondingly, when (f 
v) returns, we plug the new-found answer into contexts that need the result of (f v). The 
execution continues until it has a set soundly describing the results of (f v). 
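
The idea of plugging known results into repeated applications can be sketched as a small fixpoint computation. The sketch below is a heavy simplification and rests on several assumptions: it handles only fact applied to an arbitrary abstract integer, uses a single set in place of the memo table, and does not model waiting contexts. The names `INT`, `abs_mul`, and `summarize_fact` are hypothetical.

```python
INT = "int"  # abstract value standing for an arbitrary integer


def abs_mul(a, b):
    """Abstract multiplication: exact on concrete integers, INT otherwise."""
    if isinstance(a, int) and isinstance(b, int):
        return a * b
    return INT


def summarize_fact():
    """Collect abstract results of (fact INT) without re-entering the body
    for the similar recursive application (fact INT)."""
    results = set()  # known results of the application, as in a memo table
    changed = True
    while changed:
        changed = False
        new = {1}  # base branch: the argument is 0
        for known in results:
            # Recursive branch: reuse a known result of the similar call.
            new.add(abs_mul(INT, known))
        if not new <= results:
            results |= new
            changed = True
    return results
```

The loop stops as soon as a round contributes no new result, giving a finite set that describes every result of the application.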

To track information about application results and waiting contexts, we augment the
execution with two global tables M and Ξ as shown in figure 8. We borrow the choice of
metavariable names from work on concrete summaries (Johnson and Van Horn 2014).

A value memo table M maps each application to known results and corresponding
refinements. Intuitively, if M(Σ, V_f, V_x) ∋ ⟨V, Σ'⟩ then in some execution branch, there is an
application ⟨(V_f V_x), Σ⟩ ⟼* ⟨V, Σ'⟩.

A context memo table Ξ maps each application to contexts waiting for its result.
Intuitively, Ξ(Σ, V_f, V_x) ∋ ⟨F, Σ', ℰ_1, ℰ_k⟩ means that during evaluation, some expression

ℰ_1[(rt_⟨Σ,V_f,V_x⟩ ℰ_k[(V_f V_z)])]

with heap Σ' is paused because applying (V_f V_z) under assumptions in Σ' is the same as
applying (V_f V_x) under assumptions in Σ up to consistent address renaming specified by
function F.

To keep track of function applications seen so far, we extend the language with the
expression (rt_⟨Σ,V,V'⟩ E), which marks E as being evaluated as the result of applying V
to V', but otherwise behaves like E. The expression (blur_⟨F,Σ,V⟩ E), whose detailed role is
discussed below, approximates E under guidance from a “previous” value V.

A state in the approximating semantics with summarization consists of global tables Ξ,
M, and a set S of explored states.

Reduction now relates tables Ξ, M, and a set of states S to new tables Ξ', M' and a
new set of states S'. We define a relation ⟨Ξ, M, s⟩ ⟼ ⟨Ξ', M', s'⟩ on individual states s, and
then lift this relation point-wise to sets of states. Figure 7 only shows rules that use the
global tables or new expression forms.

In the first rule, if an application ((λX.E) V) is not previously seen, execution proceeds
as usual, evaluating expression E with X bound to V, but marking this expression using rt.

Second, if a previous application of ((λX.E) V_0) results in application of the same
function to a new argument V, we approximate the new argument before continuing. Relation
≈_F, straightforwardly defined in figure 9, determines whether two states are equivalent to
each other up to renaming F. Taking advantage of knowledge of the previous argument,
we guess the transition from V_0 to V and heuristically emulate an arbitrary amount of
such transformation using the ⊕ operator.

Third, when an application results in a similar one, we avoid stepping into the function
body and use known results from table M instead. In addition, we refine the current heap
to make better use of assumptions about the particular “base case”. We also remember the
current context as one waiting for the result of such application. To speed up convergence,
apart from feeding a new answer V_a to the context, we wrap the entire expression inside
(blur_⟨F,Σ,V⟩ [ ]) to approximate the future result.

The fourth rule in figure 7 shows reduction for returning from an application. Apart
from the current context, the value is also returned to any known context waiting on the
same application. The value is also remembered in table M. The resumption and
refinement are analogous to the previous rule.

Finally, expression (blur_⟨F,Σ_0,V_0⟩ V) approximates value V under guidance from the
previous value V_0 and also approximates values on the heap from observation of the previous
case. Overall, the approximating operator ⊕ occurs in three places: arguments of recursive
applications, results of recursive applications, and abstract values on the heap when recursive
applications return.

Figure 6 shows an implementation of operator ⊕ in an extended language with pairs.
The operator approximates the right operand with guidance from the left operand. We also
extend the syntax of values to represent inductively defined sets of values. For example,
μX.{empty, ⟨•^int?, !X⟩} denotes a proper list of integers. We approximate a concrete
integer to an abstract one if a previous integer has been seen. (A more sophisticated
implementation can use more fine-grained approximations such as positive and negative integers.)
Approximation of a pair distributes to each component if the left operand is also a pair. If
the left operand is an inductively defined set, the new value is “merged” into the set with
appropriate renaming or folding. If the left operand syntactically appears in the right one, we
emulate an arbitrary number of transitions from the former to the latter with an appropriate
inductive set. As a small precision optimization, we unroll the set once, emulating one
or more (instead of zero or more) transitions. Finally, we return the value itself as a safe
approximation. Notice that it is unsound to approximate an arbitrary value to •. In particular,
we cannot approximate a concrete function to •, discarding code with potential errors to
find.⁴ A good implementation of ⊕ should allow convergence in common cases. Empirical
results for our tool are presented in section 4.
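
A simplified version of the operator can be sketched in Python. The sketch is not the tool's implementation and makes several assumptions: it covers only integers, pairs, and the case where the left operand occurs as a proper sub-term of the right operand, and it omits the merging of existing inductive sets. The encoding (`INT`, `pair`, tagged tuples for μ and !X) and all function names are hypothetical.

```python
INT = "int?"  # abstract integer, written as • refined by int? in the text


def pair(a, b):
    return ("pair", a, b)


def is_pair(v):
    return isinstance(v, tuple) and len(v) == 3 and v[0] == "pair"


def occurs(v0, v1):
    """True if v0 appears as a proper sub-term of the pair structure v1.
    (Equal values are treated as the fall-through case in this sketch.)"""
    return is_pair(v1) and any(v0 == c or occurs(v0, c) for c in v1[1:])


def subst(new, old, v):
    """Replace every occurrence of `old` in `v` by `new`."""
    if v == old:
        return new
    if is_pair(v):
        return pair(subst(new, old, v[1]), subst(new, old, v[2]))
    return v


def widen(v0, v1):
    """Approximate v1 under guidance from the previously seen value v0."""
    if isinstance(v0, int) and isinstance(v1, int):
        return v1 if v0 == v1 else INT
    if v0 == INT and isinstance(v1, int):
        return INT
    if is_pair(v0) and is_pair(v1):
        return pair(widen(v0[1], v1[1]), widen(v0[2], v1[2]))
    if occurs(v0, v1):
        # Inductive set {v0, v1 with v0 replaced by a recursive reference},
        # unrolled once by substituting it back into v1.
        mu = ("mu", "X", (v0, subst(("ref", "X"), v0, v1)))
        return subst(mu, v0, v1)
    return v1
```

For instance, widening "empty" against a one-element list wraps the tail in an inductive set describing lists of any length built from that element, which mirrors the one-or-more unrolling mentioned above.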

Soundness of summarization: A system ⟨Ξ, M, S⟩ approximates a concrete state E if we
can recover E from the system through approximation rules (figure 10). The first rule states
that if any state in S approximates E, the system approximates it. The second rule states
that if the system knows that an instantiation of (V V_x) results in a waiting context ℰ_k,
and E' is reachable from a (possibly different) instantiation of (V V_x), then the system
also approximates ℰ_k[E']. Context ℰ_0 is an irrelevant outermost context waiting for the
application’s result, and context frames (rt_⟨_,_,_⟩ [ ]) mark the application history.

As a consequence, summarization properly handles repetition of waiting contexts, and
gives results that approximate any number of recursive applications.

With this definition in hand, we can state the central lemma to establish the soundness 
of the revised semantics that uses summarization. 

Lemma 4 (Soundness of summarization) 

4 In an implementation using environment instead of substitution, we can distribute the 
approximation to each closure’s environment’s range, obtaining approximations such as an 
inductive set of closures representing an arbitrary number of wrappings around a function. 
Values  V += empty | ⟨V, V⟩ | μX.V⃗ | !X

n_0 ⊕ n_1 = •^int?                                   if n_0 ≠ n_1
⟨V_0, V_1⟩ ⊕ ⟨V_2, V_3⟩ = ⟨V_0 ⊕ V_2, V_1 ⊕ V_3⟩
μX.V⃗_0 ⊕ μY.V⃗_1 = μX.(V⃗_0 ⊕ [!X/!Y]V⃗_1)
μX.V⃗_0 ⊕ V = μX.(V⃗_0 ⊕ [!X/μX.V⃗_0]V)
V_0 ⊕ V_1 = [μX.{V_0, [!X/V_0]V_1}/V_0]V_1           if V_0 ∈_s V_1
V_0 ⊕ V_1 = V_1                                      otherwise

where
V ∈_s V
V ∈_s ⟨V_0, V_1⟩    if V ∈_s V_0 or V ∈_s V_1

Fig. 6. Approximation


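ℰ ≠ ℰ_1[(rt_⟨Σ_0,λX.E,V_0⟩ ℰ_k)] for any ℰ_1, ℰ_k, Σ_0, V_0
---------------------------------------------------------------
⟨Ξ, M, ℰ[((λX.E) V)], Σ⟩ ⟼ ⟨Ξ, M, ℰ[(rt_⟨Σ,λX.E,V⟩ [V/X]E)], Σ⟩

ℰ = ℰ_1[(rt_⟨Σ_0,λX.E,V_0⟩ ℰ_k)] for some ℰ_1, ℰ_k, Σ_0, V_0    ⟨Σ, V⟩ ≉ ⟨Σ_0, V_0⟩    V_1 = V_0 ⊕ V
---------------------------------------------------------------
⟨Ξ, M, ℰ[((λX.E) V)], Σ⟩ ⟼ ⟨Ξ, M, ℰ[(rt_⟨Σ,λX.E,V_1⟩ [V_1/X]E)], Σ⟩

ℰ = ℰ_1[(rt_⟨Σ_0,V_f,V_0⟩ ℰ_k)] for some ℰ_1, ℰ_k, Σ_0, V_0
⟨Σ, V⟩ ≈_F ⟨Σ_0, V_0⟩    Ξ' = Ξ ⊔ [⟨Σ_0, V_f, V_0⟩ ↦ ⟨F, Σ, ℰ_1, ℰ_k⟩]
⟨V_a, Σ_a⟩ ∈ M[⟨Σ_0, V_f, V_0⟩]    Σ' = Σ[L_n ↦ Σ_a(L_o)] for each (L_o, L_n) in F
---------------------------------------------------------------
⟨Ξ, M, ℰ[(V_f V)], Σ⟩ ⟼ ⟨Ξ', M, ℰ_1[(rt_⟨Σ_0,V_f,V_0⟩ (blur_⟨F,Σ_a,V_a⟩ ℰ_k[V_a]))], Σ'⟩

Σ' = Σ[L_n ↦ Σ_0(L_o) ⊕ Σ(L_n)] for each (L_o, L_n) in F
---------------------------------------------------------------
⟨Ξ, M, ℰ[(blur_⟨F,Σ_0,V_0⟩ V)], Σ⟩ ⟼ ⟨Ξ, M, ℰ[V_0 ⊕ V], Σ'⟩

M' = M ⊔ [⟨Σ_0, V_f, V_0⟩ ↦ ⟨V, Σ⟩]
---------------------------------------------------------------
⟨Ξ, M, ℰ[(rt_⟨Σ_0,V_f,V_0⟩ V)], Σ⟩ ⟼ ⟨Ξ, M', ℰ[V], Σ⟩

M' = M ⊔ [⟨Σ_0, V_f, V_0⟩ ↦ ⟨V, Σ⟩]
⟨F, Σ_k, ℰ_1, ℰ_k⟩ ∈ Ξ[⟨Σ_0, V_f, V_0⟩]    Σ_k' = Σ_k[L_n ↦ Σ(L_o)] for each (L_o, L_n) in F
---------------------------------------------------------------
⟨Ξ, M, ℰ[(rt_⟨Σ_0,V_f,V_0⟩ V)], Σ⟩ ⟼ ⟨Ξ, M', ℰ_1[(rt_⟨Σ_0,V_f,V_0⟩ (blur_⟨F,Σ,V⟩ ℰ_k[V]))], Σ_k'⟩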

Fig. 7. Summarizing Semantics 


If E_1' ⊑ ⟨Ξ_1, M_1, S_1⟩ and E_1' ⟼ E_2', then ⟨Ξ_1, M_1, S_1⟩ ⟼* ⟨Ξ_2, M_2, S_2⟩ such that
E_2' ⊑ ⟨Ξ_2, M_2, S_2⟩.

The proof is given in the appendix. With this lemma in place, it is straightforward to
define verification as a simple corollary of soundness and prove a blame theorem.

First, we define when a module is verified by our approach.


Definition 1 (Verified module) 
Expressions            E += (rt_⟨Σ,V,V⟩ E) | (blur_⟨F,Σ,V⟩ E)
Evaluation contexts    ℰ += (rt_⟨Σ,V,V⟩ ℰ) | (blur_⟨F,Σ,V⟩ ℰ)
Context memo tables    Ξ ::= a finite sequence of entries ⟨⟨Σ,V,V⟩, ⟨F,Σ,ℰ,ℰ⟩⟩
Value memo tables      M ::= a finite sequence of entries ⟨⟨Σ,V,V⟩, ⟨V,Σ⟩⟩
Renamings              F ::= a finite sequence of pairs ⟨L,L⟩


Fig. 8. Syntax extensions for approximation 


Fig. 9. State equivalence up to renaming (structural rules for ≈_F over numbers, λ-abstractions, addresses related by F, applications, primitive applications, conditionals, dependent contracts, monitors, assume forms, and blames)


A module (module H V_c V) ∈ P is verified in P if V ≠ • and eval(P) ∌ blame^H.
Now, by soundness, H is always safe.

Theorem 3 (Verified modules can’t be blamed)

If a module named H is verified in P, then for any concrete program Q for which P is an
abstraction, eval(Q) ∌ blame^H.


Fig. 10. Approximation of Summarizing Semantics (two inference rules: approximation by a member state of S, and approximation through a waiting context recorded in Ξ)
4 Implementation and evaluation

To validate our approach, we implemented a static contract checking tool, SCV, based on
the semantics presented in section 3. The system refutes incorrect programs with concrete
test cases by running the semantics in section 3.2 and verifies the absence of run-time errors
in correct programs using the abstracted semantics in section 3.8.2. In addition, there are
a number of implementation extensions for increased precision and performance. We then
applied SCV to a wide selection of programs drawn from the literature on verification of
higher-order programs, and report on the results.

The source code for SCV and all benchmarks are available along with instructions on
reproducing the results we report.⁵ Apart from being implemented as a command line tool,
our prototype is also available as a public web REPL.⁶

4.1 Implementation extensions 

SCV supports an extended language beyond that presented in section 3 in order to handle
realistic programs. First, more base values and primitive operations are supported, such as
strings and symbols (and their operations), although we do not yet use a solver to reason
about values other than integers. We support Racket’s numeric tower, which introduces
more error sources and interesting counterexamples. Second, data structure definitions are
allowed at the top level. Each new data definition induces a corresponding (automatic)
extension to the refinement of unknown functions to deal with the new class of data. The
unknown function now also non-deterministically decomposes its argument if the
argument is a user-defined struct, in addition to applying functions and mapping first-order
values as in rule Apply-Unknown. We also extend the widening operator ⊕ to heuristically
approximate values of user-defined structs to inductively defined data, which gives good
precision in common programs. Third, modules have multiple named exports, to handle the
examples presented in section 2, and can include local, non-exported definitions. Fourth,
functions can accept multiple arguments and can be defined to have variable arity, as with
+, which accepts arbitrarily many arguments. This introduces new possibilities of errors
from arity mismatches. Fifth, a much more expressive contract language is implemented
with and/c, or/c, struct/c, and μ/c for conjunctive, disjunctive, data type, and recursive
contracts, respectively. Sixth, we provide solver back-ends for both CVC4 (Barrett et al.
2011) and Z3 (Moura and Bjørner 2008).
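
For readers unfamiliar with contract combinators, the conjunctive and disjunctive forms can be approximated for flat (first-order) contracts as predicate combinators. The Python sketch below is only an analogy to and/c and or/c: it does not model higher-order or recursive contracts, and every name in it (`and_c`, `or_c`, `gt_c`, `mon`, `ContractViolation`) is hypothetical.

```python
class ContractViolation(Exception):
    """Raised when a monitored value fails its contract; carries the blamed party."""


def and_c(*contracts):
    """Conjunction: the value must satisfy every contract."""
    return lambda v: all(c(v) for c in contracts)


def or_c(*contracts):
    """Disjunction: the value must satisfy at least one contract."""
    return lambda v: any(c(v) for c in contracts)


def gt_c(bound):
    return lambda v: v > bound


def is_int(v):
    return isinstance(v, int) and not isinstance(v, bool)


def is_str(v):
    return isinstance(v, str)


def mon(contract, value, blame):
    """Monitor `value` against a flat contract, blaming `blame` on failure."""
    if not contract(value):
        raise ContractViolation(blame)
    return value


positive_int = and_c(is_int, gt_c(0))
int_or_str = or_c(is_int, is_str)
```

Because `all` short-circuits, `positive_int` never compares a non-integer against 0, mirroring how a conjunctive contract guards its later conjuncts.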

4.2 Evaluating on existing benchmarks 

To evaluate the applicability of SCV to a wide variety of challenging higher-order contract 
checking problems, we collect examples from the following sources: programs that make 
use of control-flow-based typing from work on occurrence typing (Tobin-Hochstadt and 
Felleisen 2010), programs from work on soft typing, which uses flow analysis to check 
the preconditions of operations (Cartwright and Fagan 1991), programs with sophisticated 

5 github.com/philnguyen/soft-contract 

6 scv.umiacs.umd.edu 
Corpus | Lines | Checks | Correct Variant (ms) | Incorrect Variant (ms)
Occurrence Typing | 116 | 141 | 98.7 | 502.8
Soft Typing | 134 | 177 | 12,747.0 | 331.0
Higher-order Recursion Schemes | 527 | 859 | 14,190.7 (8) | 8,172.3
Dependent Refinement Types | 69 | 116 | 576.7 | 2,270.7
Higher-order Symbolic Execution | 223 | 308 | 9,532.0 (1) | 633.8
Correct anonymous programs (22) | 158 | 213 | 268.6 | -
Incorrect anonymous programs (110) | 778 | 1,336 | - | 14,126.9 (5)
Student Video Games | | | |
Snake | 164 | 246 | 38,602.3 | 3,034.2
Tetris | 267 | 338 | 12,303.5 | 2,255.0
Zombie | 249 | 476 | 21,276.2 | 1,152.0



Table 1. Summary benchmark results. (See the appendix for detailed results.) 


specifications from work on model checking higher-order recursion schemes (Kobayashi
et al. 2011), programs from work on inference of dependent refinement types (Terauchi
2010), and programs with rich contracts from our prior work on higher-order symbolic
execution (Tobin-Hochstadt and Van Horn 2012). We also evaluate SCV on three
interactive student video games built for a first-year programming course: Snake, Tetris, and
Zombie. These programs were all originally written as sample solutions, following the
style expected of students in the course. Of these, Zombie is the most interesting: it was
originally an object-oriented program, translated using the encoding seen in section 2.6.
Finally, we collect programs submitted anonymously by the users of our web service.

In order to evaluate our counterexample generation, we modify many correct programs
to introduce errors. To do so, we weakened preconditions, (wrongly) strengthened
postconditions, or omitted checks before performing partial operations. For example, a resulting
program may deconstruct a potentially empty list, compare potentially non-real numbers,
or promise strict inequality where equality may happen in an edge case. We believe these
changes are representative of common mistakes.

We present our results in summary form in table 1, grouping each of the above sets 
of benchmark programs; expanded forms of the tables are provided in the appendix. The 
table shows total line count (excluding blank lines and comments) and the number of 
static occurrences of contracts and primitives requiring dynamic checks such as function 
applications and primitive operations. These checks can be eliminated if we can show that 
they never fail; this has proven to produce significant speedups in practice, even without 
eliminating more expensive contract checks (Tobin-Hochstadt et al. 2011). 

The tables report the time verifying correct programs and refuting their incorrect variants. 
Execution times are in milliseconds and measured on a Core i7 2.7GHz laptop with 8GB 
of RAM. When the tool fails to fully verify a program in the “Correct Variant” column, we 
report the number of false positives next to verification time. Similarly, when the tool fails 
to generate a concrete counterexample for a program in the “Incorrect Variant” column, we 
display the number of warnings (without concrete inputs) next to refutation time. 
4.3 Discussion

First, SCV works on benchmarks for a range of previous static analyzers, from type systems
to model checking to program analysis.

Second, most programs are analyzed in a reasonable amount of time; the longest
remaining analysis time is under 60 seconds. This demonstrates that although the termination
acceleration method of section 3.8.2 is not fully general, it is effective for many
programming patterns. For example, SCV terminates with good precision on last from Wright and
Cartwright (1997), which hides recursion behind the Y combinator.

Third, across all benchmarks, over 99% (4201/4210) of the contract checks are
statically verified, enabling the elimination both of small checks for primitive operations and
expensive contracts; see below for timing results. This result emphasizes the value of static
contract checking: gaining confidence about correctness from expensive contracts without
actually incurring their cost. In practice, problems such as false positives and failure to
construct a concrete counterexample do not render the tool useless for the corresponding
programs. False positives reduce confidence about the program’s correctness and disable
contract optimization, but programmers can still run the programs with safety guaranteed
by the familiar contract monitoring semantics. On the other hand, even though SCV
cannot construct a counterexample for some programs in practice, it always soundly reports
potential contract violations. We discuss current difficulties in section 4.5.
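
The 4201/4210 figure can be cross-checked against the per-corpus numbers of table 1. The short Python computation below assumes that the parenthesized counts in the “Correct Variant” column (8 and 1) are the only checks left unverified; the variable names are ours.

```python
# Static check counts per corpus, copied from table 1.
checks = {
    "Occurrence Typing": 141,
    "Soft Typing": 177,
    "Higher-order Recursion Schemes": 859,
    "Dependent Refinement Types": 116,
    "Higher-order Symbolic Execution": 308,
    "Correct anonymous programs": 213,
    "Incorrect anonymous programs": 1336,
    "Snake": 246,
    "Tetris": 338,
    "Zombie": 476,
}

# False positives reported next to verification times in table 1.
false_positives = {
    "Higher-order Recursion Schemes": 8,
    "Higher-order Symbolic Execution": 1,
}

total = sum(checks.values())
verified = total - sum(false_positives.values())
percentage = 100 * verified / total
```

Under that assumption the totals agree with the ratio quoted in the text.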

Fourth, there are specific examples where our prototype proves to be a good complement 
to random testing in discovering contract violations. For example, SCV finds a counterex¬ 
ample to the following program quickly and automatically: 

(define (f n) (/ 1 (- 100 n))) 

By default, QuickCheck does not find this error as it only considers integers from -99
to 99. Because QuickCheck treats a program as a black box, this conservative choice is
reasonable for fear that the integer may be a loop variable causing the test case to run for a
long time (Hughes 2015). In contrast, SCV explores the program’s semantics symbolically
and discovers 100 as a good test case.
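
The contrast can be reproduced with a short Python experiment. It is an illustration, not a model of QuickCheck's actual generators: `random_test` is a hypothetical helper that samples integers from the range quoted above.

```python
import random


def f(n):
    return 1 / (100 - n)


def random_test(trials=10_000, lo=-99, hi=99, seed=0):
    """Black-box testing: sample integers in [lo, hi] and report the first
    input that makes f fail, or None if no failure is observed."""
    rng = random.Random(seed)
    for _ in range(trials):
        n = rng.randint(lo, hi)
        try:
            f(n)
        except ZeroDivisionError:
            return n
    return None
```

No input in the sampled range makes the divisor zero, so the search cannot succeed regardless of the number of trials, whereas the failing input follows directly from the condition 100 - n = 0.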

Fifth, the resulting higher-order counterexamples suggest that SCV can produce useful 
feedback. For example, it is easy for programmers to forget that Racket supports the full 
numeric tower (St-Amour et al. 2012) and that the predicate number? accepts complex 
numbers. In the following program, argmin’s contract is in fact too weak to protect the 
function. SCV proves a rgmin unsafe by applying it to a specific combination of arguments. 
First, f is given a function that produces a non-real number. Second, xs is given a list of 
length 2, which is the minimum length to trigger a use of <. 

(argmin : (any/c → number?) (and/c pair? list?) → any/c)
(define (argmin f xs)
  (argmin/acc f (car xs) (f (car xs)) (cdr xs)))
(define (argmin/acc f a b xs)
  (cond
    [(null? xs) a]
    [(< b (f (car xs))) (argmin/acc f a b (cdr xs))]
    [else (argmin/acc f (car xs) (f (car xs)) (cdr xs))]))

Contract violation: argmin violates contract with <
Value 0+1i violates contract real?
An example that triggers this violation:
(argmin (λ (x) 0+1i) (list 0 0))
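The same failure can be reproduced in a small Python sketch, offered only as an analogy: Python also has complex numbers that count as numbers but do not support ordering. The function names mirror the Racket program; violates is a hypothetical helper, and TypeError stands in for the violation of the real? precondition of <.

```python
from numbers import Number


def argmin(f, xs):
    """Return the element of the non-empty list xs that minimizes f."""
    return argmin_acc(f, xs[0], f(xs[0]), xs[1:])


def argmin_acc(f, a, b, xs):
    # a is the best element seen so far and b is f(a).
    if not xs:
        return a
    if b < f(xs[0]):  # '<' is only defined on real numbers
        return argmin_acc(f, a, b, xs[1:])
    return argmin_acc(f, xs[0], f(xs[0]), xs[1:])


def violates(f, xs):
    """True if the call reaches '<' with a non-real argument."""
    try:
        argmin(f, xs)
        return False
    except TypeError:
        return True


# A complex value satisfies a "number" check but cannot be ordered.
print(isinstance(1j, Number))
print(violates(lambda x: 1j, [0, 0]))  # two elements: '<' is reached
print(violates(lambda x: 1j, [0]))     # one element: '<' is never reached
```

As in the counterexample above, a list of length 2 is the shortest input on which the comparison is actually executed.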

Finally, SCV analyzes the functional encoding of object-oriented programs effectively. 
Zombie is one such example with extensive use of higher-order functions to encode objects 
and classes, and the tool can reveal errors buried in delayed function calls. We believe 
this is a promising first step for generating classes and objects as counterexamples. In the 
example below, we define interface posn/c that accepts two messages x and y, and function
first-quadrant? that tests whether a position is in the first quadrant. The counterexample
reveals one conforming implementation of interface posn/c that causes an error in the module.

(define posn/c
  ([msg : (one-of/c 'x 'y)]
   → (match msg ['x number?] ['y number?])))

; posn/c → boolean?
(define (first-quadrant? p)
  (and (> (p 'x) 0) (> (p 'y) 0)))

Contract violation: first-quadrant? violates contract with <
Value 0+1i violates contract real?
An example that triggers this violation:
(first-quadrant? (λ (msg) (case msg [(x) 0+1i] [(y) 0])))
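A minimal Python sketch of the same message-passing encoding may help readers unfamiliar with it. It is an analogy under the assumption that an object is just a function from a message to a number; make_posn is a hypothetical constructor that does not appear in the original program.

```python
def first_quadrant(p):
    """p is an 'object' encoded as a function from a message to a number."""
    return p("x") > 0 and p("y") > 0


def make_posn(x, y):
    """Build an object that answers the messages 'x' and 'y'."""
    def posn(msg):
        if msg == "x":
            return x
        if msg == "y":
            return y
        raise ValueError(msg)
    return posn


print(first_quadrant(make_posn(1, 2)))

# An implementation that answers 'x' with a complex number still returns
# "a number" for every message, yet the comparison inside first_quadrant fails.
try:
    first_quadrant(make_posn(1j, 0))
except TypeError as err:
    print("error inside first_quadrant:", err)
```

The error only surfaces when the message is eventually sent, which is the sense in which such bugs are buried in delayed function calls.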

Overall, our experiments show that our approach is able to discover and use invariants 
implied by conditional flows of control and contract checks. Obfuscations such as multiple 
layers of abstractions or complex chains of aliases do not impact precision (a common 
shortcoming of flow analysis). 

Finally, soft contract verification is more broadly applicable than the systems from which 
our benchmarks are drawn, which typically are successful only on their own benchmarks. 
For example, type systems such as occurrence typing (Tobin-Hochstadt and Felleisen 2010)
cannot verify any non-trivial contracts, and most soft typing systems do not consider
contracts at all. Systems based on higher-order model checking (Kobayashi et al. 2011) and
dependent refinement types (Terauchi 2010) assume a typed language; encoding our
programs using large disjoint unions produces unverifiable results.

This broad applicability is why we are not able to directly compare SCV to these other 
systems across all benchmarks. Instead, the Simple system serves as a benchmark for a 
system which does not contain our primary contributions. 


4.4 Contract optimization 

We also report speedup results for the three most complex programs in our evaluation, 
which are interactive games designed for first-year programming courses (Snake, Tetris, 
and Zombie). For each, we recorded a trace of input and timer events while playing the
game, and then used that trace to re-run the game (omitting all graphical rendering) both 
with the contracts that we verified, and with the contracts manually removed. Each game 
was run 100 times in both modes; the total time is presented below. 


Program | Contracts On (ms) | Contracts Off (ms)
snake | 475,799 | 59
tetris | 1,127,591 | 186
zombie | 12,413 | 1,721


The timing results are quite striking—speedup ranges from over 5x to over 5000x. This 
does not indicate, of course, that speedups of these magnitudes are achievable for real 
programs. Instead, it shows that programmers avoid the rich contracts we are able to verify, 
because of their unacceptable performance overhead. Soft contract verification therefore 
enables programmers to write these specifications without the run-time cost. 
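The quoted range can be checked directly from the timings in the table. The short Python calculation below divides the time with contracts on by the time with contracts off for each game; the dictionary name timings_ms is just a label chosen for this sketch.

```python
# (contracts on, contracts off) in milliseconds, copied from the table above.
timings_ms = {
    "snake": (475_799, 59),
    "tetris": (1_127_591, 186),
    "zombie": (12_413, 1_721),
}

speedups = {name: on / off for name, (on, off) in timings_ms.items()}

for name, ratio in speedups.items():
    print(f"{name}: about {ratio:.1f}x")
```

Zombie falls at the low end (a single-digit factor) while Snake and Tetris are in the thousands, which matches the "over 5x to over 5000x" summary.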

The difference in timing between Zombie and the other two games is intriguing because 
Zombie uses higher-order dependent contracts extensively, along the lines of vec/c from 
section 2.6, which intuitively should be more expensive. An investigation reveals that 
most of the cost comes from monitoring flat contracts, especially those that apply to data 
structures. For example, in Snake, disabling posn/c, a simple contract that checks for a 
posn struct with two numeric fields, cuts the run-time by a factor of 4. This contract is 
repeatedly applied to every such object in the game. In contrast, higher-order contracts, as 
in the object encodings used in Zombie, delay contracts and avoid this repeated checking. 

4.5 Limitations and Challenges 

We discuss current limitations of our approach and ways to mitigate them.

First, our approach does not yet give a way to verify deep structural properties expressed 
as dependent contracts such as “map over a list preserves the length” or “all elements in 
the result of filter satisfy the predicate”, resulting in the false positives seen in table 1. 
However, it can already be used to verify many interesting programs because often safety 
questions depend only on knowledge of top-level constructors. Examples of these patterns 
appear in programs from Kobayashi et al. (2011) for programs such as reverse (see also 
§2.5), nil, and mem. 

Second, the analysis is prone to the combinatorial explosion inherent in symbolic
execution. In practice, most conditionals come from case analyses instead of independent
alternatives, and we rely on a precise proof system to eliminate spurious paths. In addition,
we avoid excessive state explosion as in rules Apply-Case-1 and Apply-Case-2 and defer
state splitting until necessary by encoding the constraint of equal inputs implying equal
outputs during translation. Finally, modularity mitigates the problem further, as modules
tend to be small, and contracts at boundaries help recover necessary precision.

Third, the search for counterexamples can be significantly hindered by complex
preconditions, where the input is guarded against a deep, inductively defined property. Execution
follows different branches before being able to generate a valid input to continue verifying
the module. A naive breadth-first search is bogged down by a large frontier resulting from
different attempts to generate input, most of which are eventually found invalid. To mitigate
this slow-down, we identify a class of expressions as likely to lead to counterexamples and
prioritize their execution. Specifically, an expression whose innermost contract monitoring
is of a first-order property on a concrete module is likely to reveal a bug (see footnote 7).
In contrast, expressions in the middle of input generation do not have this form, because the
innermost contract monitoring is on the opaque input source. Once the system successfully
instantiates a concrete input and turns the program into this “suspect” form, it focuses on
exploring this branch with that input instead of trying numerous other inputs in parallel.
Using this simple heuristic, we are able to cut the execution time of a module violating the
“braun-tree” invariant from not terminating within 1 hour down to 2 seconds.
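The effect of such a prioritization can be sketched on a toy search problem. The Python code below is not SCV's implementation: the state encoding, the step function, and the names explore, is_suspect, and is_error are all assumptions made for illustration. States tagged "gen" stand for input-generation attempts and states tagged "run" stand for execution of the module on a concrete input (the "suspect" form).

```python
import heapq
from itertools import count


def explore(initial, step, is_suspect, is_error, limit=10_000):
    """Search that always pops 'suspect' states before any other state."""
    tie = count()  # preserves insertion order among states of equal priority
    frontier = [(0 if is_suspect(initial) else 1, next(tie), initial)]
    visited = 0
    while frontier and visited < limit:
        _, _, state = heapq.heappop(frontier)
        visited += 1
        if is_error(state):
            return state, visited
        for nxt in step(state):
            heapq.heappush(frontier, (0 if is_suspect(nxt) else 1, next(tie), nxt))
    return None, visited


def step(state):
    kind, bits, depth = state
    if kind == "gen":
        if len(bits) < 4:  # still building the input, one bit at a time
            return [("gen", bits + (0,), 0), ("gen", bits + (1,), 0)]
        return [("run", bits, 0)]  # the input is now concrete
    # "run": keep executing the module on the concrete input
    return [("run", bits, depth + 1)] if depth < 3 else []


def is_error(state):
    kind, bits, depth = state
    return kind == "run" and depth == 3 and sum(bits) == 0


start = ("gen", (), 0)
prioritized = explore(start, step, lambda s: s[0] == "run", is_error)
breadth_first = explore(start, step, lambda s: False, is_error)
print(prioritized, breadth_first)
```

With no state marked as suspect the queue degenerates to plain breadth-first order, so the two calls find the same error but visit different numbers of states.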

Finally, there is a mismatch between the solver’s data types and Racket’s
rich numeric tower. In particular, Racket supports mixed arithmetic between different types
of numbers up to complex numbers (St-Amour et al. 2012), while Z3’s treatment of numbers
resembles that of most statically typed languages, and the solver does not perform well in
generating models involving a dynamic restriction of a number’s type. Below is an example
where SCV fails to generate a counterexample:

(f : integer? → integer?)
(define (f n) (/ 1 (+ 1 (* n n))))

In Racket, division is defined on the full numeric tower, and the result of
(/ 1 (+ 1 (* n n))) may not be an integer. In the generated query, this result is an unknown
number L of type Real, and the solver cannot give a model to a constraint set asserting
“(not (is_int L))”. In addition, Racket distinguishes between exact and inexact numbers, where
inexact numbers are floating point approximations. Because Z3 does not reason about floating
points, we currently do not soundly model inexact arithmetic.
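To see concretely why the contract of f can be violated, the following Python sketch uses exact rationals from the standard fractions module as a stand-in for Racket's exact arithmetic. It is only an illustration of the arithmetic, not of the solver query; is_integer is a hypothetical helper.

```python
from fractions import Fraction


def f(n):
    """Exact analogue of (/ 1 (+ 1 (* n n))) on integer inputs."""
    return Fraction(1, 1 + n * n)


def is_integer(q):
    """An exact rational is an integer when its reduced denominator is 1."""
    return q.denominator == 1


# Integer inputs whose result is not an integer, i.e. contract violations.
counterexamples = [n for n in range(-3, 4) if not is_integer(f(n))]
print(f(1), counterexamples)
```

Any non-zero integer input is a counterexample, so the difficulty described above lies in the solver's model generation rather than in the scarcity of failing inputs.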


5 Related work 

In this section, we relate our work to several strands of research: symbolic execution,
random testing, soft typing, static contract verification, refinement types, and model
checking of recursion schemes.

Symbolic execution: Symbolic execution is the idea of running programs with abstract 
inputs. Symbolic execution on first-order programs is mature and has been used to find 
bugs in real-world programs (Cadar et al. 2006, 2008). Cadar et al. (2006) present a
symbolic execution engine for C that generates counterexamples in the form of mappings
from addresses to bit-vectors. Later work extends the technique to generate comprehensive
test cases that discover bugs in large programs interacting with the environment (Cadar 
et al. 2008). 


Footnote 7: In a symbolic program, the monitored value in this position is usually abstract
and covers all values the module produces.

Random Testing: Random testing is a lightweight technique for finding counterexamples
to program specifications through randomly generated inputs. QuickCheck for Haskell
(Claessen and Hughes 2000) showed the approach to be highly practical in finding bugs for
functional programs. Later work extends random testing to improve code coverage and scales the
technique to more language features such as state and class systems. Heidegger and Thiemann
(2010) use contracts to guide random testing for JavaScript, allowing users to annotate
inputs to combine different analyses for increasing the probability of hitting branches with
highly constrained preconditions. Klein et al. (2010) also extend random testing to work
on higher-order stateful programs, discovering many bugs in object-oriented programs
in Racket. Seidel et al. (2015) use refinement types as generators for tests, significantly
improving code coverage.

Our approach is a complement to random testing. By combining symbolic execution with 
an SMT solver, the method takes advantage of conditions generated by ordinary program 
code and not just user-annotated contracts. In addition, the approach works well with highly 
constrained preconditions without further help from users. In contrast, random testing
systems typically require programmers to implement custom generators (Claessen and Hughes
2000) or require user annotations to incorporate a specific analysis collecting all literals in 
the program to guide input construction (Heidegger and Thiemann 2010). Type-targeted 
testing (Seidel et al. 2015) is more lightweight and does not necessitate an extension to 
the existing semantics, but gives no guarantee about completeness, as inherent in random 
testing. Even though the tool rules out test cases that fail the pre-conditions, regular code 
and post-conditions do not help the test generation process. Our system makes use of both 
contracts and regular code to guide the execution to seek inputs that both satisfy
preconditions and fail post-conditions. Exploring possible combinations of symbolic execution
and random testing for more efficient bug-finding in higher-order programs is future
work.

Soft typing: Verifying the preconditions of primitive operations can be seen as a weak form
of contract verification, and soft typing is a well-studied approach to this kind of
verification (Cartwright and Felleisen 1996). There are two predominant approaches to soft typing:
one is based on a generalization of Hindley-Milner type inference (Cartwright and Fagan
1991; Wright and Cartwright 1997; Aiken et al. 1994), which views an untyped program as
being embedded in a typed one and attempts to safely eliminate coercions (Henglein 1994).
The other is founded on set-based abstract interpretation of programs (Flanagan et al. 1996;
Flanagan and Felleisen 1999). Both approaches have proved effective for statically
checking preconditions of primitive operations, but neither scales to checking pre-
and post-conditions of arbitrary contracts. For example, Soft Scheme (Cartwright and Fagan
1991) is not path-sensitive and does not reason about arithmetic, thus it is unable to verify
many of the occurrence-typing or higher-order recursion scheme examples considered in 
the evaluation. 

Contract verification: Following in the set-based analysis tradition of soft-typing, there 
has been work extending set-based analysis to languages with contracts (Meunier et al. 
2006). This work shares the overarching goal of this paper: to develop a static contract 
checking approach for components written in untyped languages with contracts. However,
the work fails to capture the control-flow-based type reasoning essential to analyzing
untyped programs and is unsound (as discussed by Tobin-Hochstadt and Van Horn (2012)).
Moreover, the set-based formulation is complex and difficult to extend to features
considered here.

Our prior work (Tobin-Hochstadt and Van Horn 2012), as discussed in the introduction, 
also performs soft contract verification, but with far less sophistication and success. As our 
empirical results show, the contributions of this paper are required to tackle the arithmetic 
relations, flow-sensitive reasoning, and complex recursion found in our benchmarks. 

An alternative approach has been applied to statically checking contracts in Haskell and 
OCaml (Xu 2012; Xu et al. 2009), which is to inline monitors into a program following a 
transformation by Findler and Felleisen (2002) and then simplify the program, either using 
the compiler, or a specialized symbolic engine equipped with an SMT solver. The approach 
would be applicable to untyped languages except for the final step dubbed logicization, a 
type-based transformation of program expressions into first-order logic (FOL). A related 
approach used for Haskell is to use a denotational semantics that can be mapped into 
FOL, which is then model checked (Vytiniotis et al. 2013), but this approach is highly 
dependent on the type structure of a program. In contrast, our approach does not assume 
a type system to guide the verification process, and therefore verifies run-time type-safety 
in addition to richer contracts. Further, these approaches assume a different semantics for 
contract checking that monitors recursive calls. This allows the use of contracts as inductive 
hypotheses in recursive calls. In contrast, our approach can naturally take advantage of this 
stricter semantics of contract checking and type systems, but can also accommodate the 
more common and flexible checking policy. Additionally, our approach does not rely on 
type information, the lack of which makes these approaches inapplicable to many of our 
benchmarks. 

Contract verification in the setting of typed, first-order contracts is much more mature. 
A prominent example is the work on verifying C# contracts as part of the Code Contracts 
project (Fähndrich and Logozzo 2011).

Refinement type checking: Refinement types are an alternative approach to statically 
verifying pre- and post-conditions in a higher-order functional language. There are several 
approaches to checking type refinements; one is to restrict the computational power of 
refinements so that checking is decidable at type-checking time (Freeman and Pfenning 
1991); another is to allow unrestricted refinements as in contracts, but to use a solver to 
attempt to discharge refinements (Knowles and Flanagan 2010; Rondon et al. 2008; Vazou 
et al. 2013). In the latter approach, when a refinement cannot be discharged, some systems 
opt to reject the program (Rondon et al. 2008; Vazou et al. 2013), while others such as 
hybrid type-checking residualize a run-time check to enforce the refinement (Knowles and 
Flanagan 2010), similar to the way soft-typing residualizes primitive pre-condition checks. 
Although the end result of our approach closely resembles that of hybrid type checking, we 
differ in a few important respects. First, we do not rely on an existing type system. Second, 
the method scales straightforwardly to first-class contracts, whereas existing refinement
type systems allow user-defined predicates only for base types and provide no mechanism for a
dynamically computed mix of flat and higher-order specifications. Third, symbolic
execution ignores unreachable errors such as those under unreachable lambdas, while type
checking eagerly checks all code. Finally, handling unknown functions on the semantics
side instead of relying on the theory of uninterpreted functions introduces potentially fewer
difficulties in scaling to effectful contracts, and allows straightforward generation of
higher-order counterexamples (see footnote 8).

DJS (Chugh et al. 2012; Chugh et al. 2012) supports expressive refinement specification 
and verification for stateful JavaScript programs, including sophisticated dependent
specifications which SCV cannot verify. However, most dependent properties require heavy
annotations. Moreover, null inhabits every object type. Thus the approach cannot give 
the same guarantees about programs such as reverse (§2.5) without significantly more 
annotation burden. Additionally, it relies on whole program annotation, type-checking, and 
analysis. 

Model checking higher-order recursion schemes: Much of the recent work on model
checking of higher-order programs relies on the decidability of model checking trees
generated by higher-order recursion schemes (HORS) (Ong 2006). A HORS is essentially a
program in the simply-typed λ-calculus with recursion and finitely inhabited base types that
generates (potentially infinite) trees. Program verification is accomplished by compiling a
program to a HORS in which the generated tree represents program event sequences
(Kobayashi 2009b; Kobayashi et al. 2010). This method is sound and complete for the simply
typed λ-calculus with recursion and finite base types, but the gap between this language
and realistic languages is significant. Subsequently, an untyped variant of HORS has been
developed (Tsukada and Kobayashi 2010), which has applications to languages with more
advanced type systems, but despite the name it does not lead to a model checking procedure
for the untyped λ-calculus. A subclass of untyped HORS is the class of recursively typed
recursion schemes, which has applications to typed object-oriented programs (Kobayashi
and Igarashi 2013). In this setting, model checking is undecidable, but relatively complete
with a certain recursive intersection type system (anything typable in this system can be
verified). To cope with infinite data domains such as integers, counter-example guided
abstraction refinement (CEGAR) techniques have been developed (Kobayashi et al. 2011).
The complexity of model checking even for the simply typed case is n-EXPTIME hard
(where n is the rank of the recursion scheme), but progress on decision procedures
(Kobayashi and Ong 2009; Kobayashi 2009a) has led to verification engines that can verify a
number of “small but tricky higher-order functional programs in less than a second.”

In comparison, the HORS approach can verify some specifications which SCV cannot, 
but in a simpler (typed) setting, whereas our lightweight method applies to richer languages. 
Our approach handles untyped higher-order programs with sophisticated language features 
and infinite data domains. Higher-order program invariants may be stated as behavioral 
contracts, while the HORS-based systems only support assertions on first order data. Our 
work is also able to verify programs with unknown external functions, not just unknown 
integer values, which is important for modular program verification, and we are able to 
verify many of the small but tricky programs considered in the HORS work. 


Footnote 8: Solvers such as Z3 and CVC4 do not support model generation for higher-order
functions.
6 Conclusions and perspective

We have presented a lightweight method and prototype implementation for static
contract checking using a non-standard reduction semantics that is capable of verifying and
falsifying higher-order modular programs with arbitrarily omitted components. Our tool,
SCV, scales to realistic language features such as recursive data structures and modular
programs, and verifies programs written in the idiomatic style of dynamic languages. The
analysis proves the presence and absence of run-time errors without excessive reliance on
programmer help. With zero annotation, SCV already helps programmers find unjustified
usage of partial functions by showing concrete inputs that trigger those errors. With explicit
contracts, programmers can enforce rich specifications on their programs and have the
correct ones optimized away without incurring significant run-time overhead and the incorrect
ones quickly falsified with concrete test cases.

While in this paper we have addressed the problem of soft contract verification, the
technical tools we have introduced apply beyond this application. For example, a run of SCV
can be seen as a modular program analysis: it soundly predicts which functions are called
at any call site. Moreover, it can be composed with whole-program analysis techniques to
derive modular analyses (Van Horn and Might 2010). Adding temporal contracts (Disney
et al. 2011) to our system would produce a model checker for higher-order languages. This
breadth of application follows directly from the semantics-based nature of our approach.
A Proofs

This section presents proofs for theorems in the paper. Lemmas 2 and 4 prove theorem 3.
Other lemmas support these main ones.

Theorem 1 (Soundness of Counterexamples) 

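If $\langle E_1, \Sigma_1 \rangle \longmapsto^* \langle \mathsf{blame}^{\ell}_{\ell'}, \Sigma_n \rangle$, $\Sigma' \sqsubseteq \Sigma_n$, and $E'_1 = \Sigma'(E_1)$, then $E'_1 \longmapsto^* \mathsf{blame}^{\ell}_{\ell'}$.

Proof

First, $\Sigma' \sqsubseteq \Sigma_i$ for any heap $\Sigma_i$ on the trace $\langle E_1, \Sigma_1 \rangle \longmapsto^* \langle \mathsf{blame}^{\ell}_{\ell'}, \Sigma_n \rangle$ (by lemma 6). Next,
if $\langle E_i, \Sigma_i \rangle \longmapsto \langle E_{i+1}, \Sigma_{i+1} \rangle$ and $E'_i \sqsubseteq \langle E_i, \Sigma' \rangle$, then $E'_i \longmapsto E'_{i+1}$ such that $E'_{i+1} \sqsubseteq \langle E_{i+1}, \Sigma' \rangle$
(by lemma 7). Therefore, any fully concrete instantiation of the final heap leads the program
through the same execution trace. □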

Theorem 2 (Relative Completeness of Counterexamples) 

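If $E'_1 \longmapsto^* \mathsf{blame}^{\ell}_{\ell'}$, $E'_1 \sqsubseteq \langle E_1, \Sigma_1 \rangle$, and there is a complete procedure for generating values
satisfying first-order constraints, then $\langle E_1, \Sigma_1 \rangle \longmapsto^* \langle \mathsf{blame}^{\ell}_{\ell'}, \Sigma_n \rangle$ such that we can derive
some $\Sigma'$ such that $\Sigma' \sqsubseteq \Sigma_n$.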


Proof 
The discovery of the error follows from soundness of the reduction relation (lemma 2). We
show that the instantiation of the final heap is relatively complete with respect to the
underlying solver by induction on the size of $\Sigma_n$.

• If $\Sigma_n = \emptyset$: There is $\Sigma' = \emptyset$ such that $\emptyset \sqsubseteq \emptyset$.

• If $\Sigma_n = \Sigma_{n-1}[L \mapsto V]$ (see footnote 9): Assume there is $\Sigma'_{n-1}$ such that $\Sigma'_{n-1} \sqsubseteq \Sigma_{n-1}$.

- If $V = \bullet^{C}$: All constraints in $C$ are first-order by construction, so we can
produce a model for them by assumption. (In particular, if $C = \emptyset$, any concrete first-order
value can instantiate the unknown value.)

- If $V = \lambda X.E$: Then $\Sigma'_n = \Sigma'_{n-1}[L \mapsto \lambda X.E]$. By induction hypothesis, for each
address $L$ in $E$, $\Sigma'_{n-1}(L)$ properly instantiates $\Sigma(L)$.

- If $V = n$: The case is trivial.

□

Theorem 3 (Verified modules can’t be blamed) 

If a module named $\ell$ is verified in $P$, then for any concrete program $Q$ for which $P$ is an
abstraction, $\mathsf{eval}(Q) \not\ni \mathsf{blame}^{\ell}$.

Lemma 1 (Soundness of abstract reduction relation ) 

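If $E'_1 \sqsubseteq \langle E_1, \Sigma_1 \rangle$ and $E'_1 \longmapsto E'_2$, then $\langle E_1, \Sigma_1 \rangle \longmapsto^* \langle E_2, \Sigma_2 \rangle$ such that $E'_2 \sqsubseteq \langle E_2, \Sigma_2 \rangle$
for some $\Sigma_2 \sqsupseteq \Sigma_1$.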

Proof 

By case analysis on the derivation of $E'_1 \longmapsto E'_2$ and $E'_1 \sqsubseteq \langle E_1, \Sigma_1 \rangle$.

• Case $E'_1 = (O\ V'_1)$, $E_1 = (O\ V_1)$ and $E'_2 = A'$ because $\delta(\emptyset, O, V'_1) \ni (A', \emptyset)$:
By soundness of $\delta$ (lemma 3), $\langle E_1, \Sigma_1 \rangle \longmapsto \langle E_2, \Sigma_2 \rangle \sqsupseteq E'_2$.

• Case $E'_1 = \mathsf{if}\ V'\ E'_2\ E'_3$, $E_1 = \mathsf{if}\ V\ E_2\ E_3$ and $E'_1 \longmapsto E'_2$ because $\delta(\emptyset, \mathsf{zero?}, V') \ni (0, \emptyset)$:
By soundness of $\delta$ (lemma 3), $\delta(\Sigma_1, \mathsf{zero?}, V) \ni (0, \Sigma_2)$, so $\langle E_1, \Sigma_1 \rangle \longmapsto \langle E_2, \Sigma_2 \rangle \sqsupseteq E'_2$.
The other case of the conditional is similar.

• Case $E'_1 = (\lambda X.E'\ V'_x)$, $E'_2 = [V'_x/X]E'$, $E_1 = (V_f\ V_x)$:

- Case $V_f = \lambda X.E$: then $\langle E_2, \Sigma_2 \rangle = \langle [V_x/X]E, \Sigma_1 \rangle \sqsupseteq E'_2$.

- Case $V_f = L$, where $\Sigma_1(L) = \bullet$: then $\Sigma_2 = \Sigma_1[L \mapsto \lambda X.E]$ as in rule Apply-Unknown,
and $E'$ is of the restricted form approximated by $E$, so $\langle [V_x/X]E, \Sigma_2 \rangle \sqsupseteq E'_2$.

- Case $V_f = L$, where $\Sigma_1(L) = \lambda X.E$: then $\langle E_2, \Sigma_2 \rangle = \langle [V_x/X]E, \Sigma_1 \rangle \sqsupseteq E'_2$.

• Case $E'_1 = \mathsf{mon}(V'_c, V')$, $E_1 = \mathsf{mon}(V_c, V)$ (blame labels on mon omitted):

- Case $\delta(\emptyset, \mathsf{dep?}, V'_c) \ni (0, \emptyset)$: By soundness of $\delta$ (lemma 3), $\delta(\Sigma_1, \mathsf{dep?}, V_c) \ni (0, \Sigma_2)$.
In addition, by soundness of the provability relation (lemma 4), either
both $E'_1$ and $E_1$ take shortcuts to the result, or both step to the contract checking
form, or $E'_1$ takes shortcuts and $E_1$ steps to the contract-checking form, which by
lemma 5 eventually steps to the result approximating $E'_2$.

Other cases are straightforward. □

Footnote 9: It is straightforward to see that the heap does not contain cycles, by case analysis
on the last step of updating the heap in the reduction relation.

Lemma 2 (Soundness of summarization) 

The semantics with summarization using tables E and M is sound with respect to an
extension of the original semantics without these tables, with trivial rules for rt and blur
frames:

( V > 1 -l V 

(blur^ ) V) i—» V 

If E[ C (Ei,Mi,5i) and E\ \—> E (, then (Si,Mi,Si) i —>* (Z 2 ,M 2 1 S 2 ) such that E 2 C 

(S 2 ,M 2 ,S 2 >. 

Proof 

By induction on the derivation of E\ C (Ei,Mi,Si) and case analysis on the reduction 
E[^Ef 

• Case E[ C (Ei,Mi,Si) because E[ C (E\, Ei) and (Ei, Ei) € Si: 

Case analysis on E\ \—^ Ef. 

— Sub-case: E[ = S'fX.E' V'] , E' 2 1 —» S'[(rt ^xx.E',v') [V'/X]E')] , and E x = 
S[XX.E V}: 

- If application (A X.EV) is new: (£i,Ei) /3-reducesto (£ 2 , £ 2 ), and 

(Si,Mi,Si U{(£ 2 , £ 2 }}) 
straightforwardly approximates E' 2 . 

- If application (A X.E V ) is a recursive call with a new argument: (Ei ,Mi, gi) 
p -reduces with a widened argument, which also straightforwardly approxi¬ 
mates E 2 . 

- If application (A X.E V) is a repeated recursive call: 

(Si,Mi,Si) 1 —» (E 2 ,Mi,S 2 ) 

where S 2 = Ei U [(Eo,AX.£, Vo) i-g (£,Ei,<£o,4)], and some S 2 2 Si. 
Moreover, we have S = S 0 [( rt (Zo,\x.E,v 0 } 4)], and 

S' = S'o[( rt^ x X E t S ' k )] 

so E 2 = S'o[( rt( 0 2 , X £ / y^ S'k[)rt^xx.E',v') [V'/X]E ')])]. 

Because {Sq[{ rt^ Q t xx.E,v 0 ) Sk[LX.E V ])], Ei) € S 2 , it follows from lemma 9 
that 

4[(rt (z 0 Xx.e,v 0 ) [^b /X]E)} € S 2 . 

Thus, S' 0 {(rfv M . E 'y>) [V'/X]E')} C (S 2 ,Mi,S 2 ). 

Hence, 4 0 [( rt (0 vj/> S' k {( [V'/X]E’)])\ C (E 2 ,Mi,S 2 ). 

— Other sub-cases are straightforward 

• Case£( C (Ei,Mi,Si) because: 

— E[ = S'o[{ rt^y/^} 4[(rt (Wl> £')])] 

— 4[(rt< 0 ,v',v') -£’ / )] E (“i,Mi,Si) 
— E,(E 0 ,V,Vv)9(f,Si,4),4)

— Vo E (V„ Ej); y| E (Vv, Ej) 

There are 2 subcases, whether E' is a value or can be decomposed into a context and 
redex. 

— lf£' is a value V' a \ 

This means 

E[ = 4o[( rt ( 0 ,v',v ( ;> 4*[( Vj)])] 

^2 = 4o[(rt^ 0 y/ y o ) 4*^])]. 

By lemma 10, there exists (4[( rt {z 0> v,y c ) Vo)]) Ei) € Si such that 

4o[( ft {m.v'yf) y' a )\ E (4[( Kr)], Ei). 

Then (Si,Mi,Si) 1 —S> (S 2 ,M 2 ,S 2 ) such that S 2 9 (4[( rtfo.vw*) 4^])], Ei), 
which approximates Ef 
— If E\ =S’\[E”\. 

We have 4 0 [(rt S'\[E'{])]y —■> 4 0 [( rt <0 v , y ^ S’i [£"])]• 

By induction hypothesis, (Si,Mi,Si) 1 —¥* (S 2 ,M 2 ,S 2 ), such that 

4o[(rt <0) y, >y/) 4,^])] E (S 2 ,M 2 ,S 2 ). 

Because S 2 E Ei, 4 0 [(rt <0 y/ Vo) E (S 2 ,M 2 ,S 2 ) follows. 

□ 

Lemma 3 (Soundness of primitive operations) 

If E' E E ‘ (E, Ei), V' E 1 ' 1 (V^, Ei) and S(0,0,vf) 9 (A', 0) then S(Ei,0,vt) 9 (A, E 2 ) 
such that A E 1 " 2 (A, E 2 ) and E' E 5 " 2 (E, E 2 ) for some E(, 9 E|. 

Proof 

By inspection of cases of O and V' E (V\ Ej) and consistency of the provability relation 
(lemma 4). □ 

Lemma 4 (Consistency of provability relation) 

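If $V' \sqsubseteq \langle V, \Sigma \rangle$ and $V'_c \sqsubseteq \langle V_c, \Sigma \rangle$ then:

• If $\emptyset \vdash V' : V'_c\ ✓$ then either $\Sigma \vdash V : V_c\ ✓$ or $\Sigma \vdash V : V_c\ ?$

• If $\emptyset \vdash V' : V'_c\ ✗$ then either $\Sigma \vdash V : V_c\ ✗$ or $\Sigma \vdash V : V_c\ ?$

• If $\emptyset \vdash V' : V'_c\ ?$ then $\Sigma \vdash V : V_c\ ?$

Proof

By inspection of cases of $V' \sqsubseteq \langle V, \Sigma \rangle$ and $V'_c \sqsubseteq \langle V_c, \Sigma \rangle$. □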

Lemma 5 (Soundness of provability relation) 

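If $V' \sqsubseteq \langle V, \Sigma_1 \rangle$, $V'_c \sqsubseteq \langle V_c, \Sigma_1 \rangle$, $\emptyset \vdash V' : V'_c\ ✓$ and $\Sigma_1 \vdash V : V_c\ ?$ then $\langle (V_c\ V), \Sigma_1 \rangle \longmapsto^* \langle V_a, \Sigma_2 \rangle$
such that $\delta(\Sigma_2, \mathsf{zero?}, V_a) \ni (0, \Sigma_3)$.

Proof

By cases of $\emptyset \vdash V' : V'_c\ ✓$ (where $V'$ is concrete) and $V'_c \sqsubseteq \langle V_c, \Sigma_1 \rangle$. □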

Lemma 6 

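If $\Sigma' \sqsubseteq \Sigma_2$ and $\langle E_1, \Sigma_1 \rangle \longmapsto \langle E_2, \Sigma_2 \rangle$ then $\Sigma' \sqsubseteq \Sigma_1$.

Proof

By case analysis of $\langle E_1, \Sigma_1 \rangle \longmapsto \langle E_2, \Sigma_2 \rangle$. The prior heap is either a restriction of $\Sigma_2$, or
has the same domain, mapping some addresses to more abstract values than $\Sigma_2$. □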

Lemma 7 (Completeness of refinement ) 

If $\langle E_1, \Sigma_1 \rangle \longmapsto \langle E_2, \Sigma_2 \rangle$, $\Sigma' \sqsubseteq \Sigma_1$, $\Sigma' \sqsubseteq \Sigma_2$, and $E'_1 \sqsubseteq \langle E_1, \Sigma' \rangle$, then $E'_1 \longmapsto E'_2$ such that
$E'_2 \sqsubseteq \langle E_2, \Sigma' \rangle$.

Proof

By case analysis on the reduction step. For each case, the reduction leaves enough
refinement on the heap to steer all instantiations to the same path. The case of primitive operations
is deferred to lemma 8. □

Lemma 8 (Completeness of primitive operations) 

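If $\delta(\Sigma_1, O, V) \ni (A, \Sigma_2)$, $\Sigma' \sqsubseteq \Sigma_1$, $\Sigma' \sqsubseteq \Sigma_2$, and $V' \sqsubseteq \langle V, \Sigma' \rangle$, then $\delta(\emptyset, O, V') \ni (A', \emptyset)$
such that $A' \sqsubseteq \langle A, \Sigma_2 \rangle$.

Proof

By inspection of cases of $\delta$. □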

Lemma 9 

If (0,0,{(£, 0)}) 1 — (Z,M,S) and £[(rt {E y f y x) £')] e S, then S[(V f V x )] € 5. 

Proof 

By induction on (0,0,{(£, 0)}) 1 —>* (; Z,M,S ). 

• Case (0,0, {(£, 0)}) = (Z,M,S): We assume programmers cannot write expressions 
of the form ( rt/^ym £). The case holds trivally. 

• Case <0,0, {<£, 0)}) 1 —>* ( Z',M',S ') and ( Z',M',S ') 1 —( Z,M,S ): Case analysis 
on the reduction ( Z',M',S'). If the reduction introduces a new frame ( rt^ VfVx ) £) 
in S, it must have resulted from the application (Vf V x ) in S'. 

□ 

Lemma 10 

If (<?[( rt (£0)V/iVi) V)], E) C (Z,M,S) where f S\ [( r ’ t <_,y /) _> <&)] for any S\, £ 2 , then 
there is g <E S such that (£[( rt (Zoy f y x ) V)], E) C g 

Proof 

By case analysis on the derivation 

(£[(rt {Lo y f y x) V)},L)r(Z,M,S) 

(only the base case of • is applicable). □ 
Program | Lines | Checks | Correct Variant (ms) | Incorrect Variant (ms)
ex-01 | 6 | 4 | 3.3 | 32.4
ex-02 | 6 | 8 | 3.9 | 29.4
ex-03 | 10 | 12 | 22.0 | 57.8
ex-04 | 11 | 12 | 7.8 | 41.4
ex-05 | 6 | 6 | 4.7 | 31.4
ex-06 | 8 | 11 | 5.1 | 32.5
ex-07 | 8 | 7 | 4.7 | 34.5
ex-08 | 6 | 11 | 7.0 | 47.2
ex-09 | 14 | 12 | 8.6 | 32.1
ex-10 | 6 | 8 | 3.5 | 30.5
ex-11 | 9 | 8 | 6.7 | 33.3
ex-12 | 5 | 11 | 5.7 | 31.3
ex-13 | 9 | 11 | 7.5 | 34.6
ex-14 | 12 | 20 | 8.2 | 34.4
Total | 116 | 141 | 98.7 | 502.8

Table B 1. Logical types for untyped languages benchmarks



Program | Lines | Checks | Correct Variant (ms) | Incorrect Variant (ms)
append | 8 | 15 | 22.7 | 6.4
cpstak | 23 | 15 | 12,449.6 | 46.0
flatten | 12 | 24 | 27.2 | 37.5
last-pair | 7 | 9 | 21.1 | 30.6
last | 17 | 21 | 35.7 | 19.0
length-acc | 10 | 14 | 26.6 | 8.0
length | 8 | 13 | 22.7 | 6.7
member | 8 | 15 | 23.3 | 34.9
rec-div2 | 9 | 17 | 22.5 | 36.1
subst* | 11 | 12 | 23.1 | 34.1
tak | 12 | 14 | 22.7 | 36.8
taut | 9 | 8 | 22.2 | 34.9
Total | 134 | 177 | 12,719.4 | 331.0

Table B 2. Soft typing benchmarks


B Detailed evaluation results 

This section shows detailed evaluation results for benchmarks collected from different
verification papers. All experiments were run on a Core i7 @ 2.70GHz laptop running
64-bit Ubuntu 13.10. Analysis times are averaged over 10 runs.
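
As a rough illustration of this measurement protocol (not the harness actually used for these experiments), averaging wall-clock analysis time over a fixed number of runs can be sketched as follows. The names `mean_runtime_ms`, `analyze`, and `fake_analyze` are hypothetical stand-ins introduced only for this sketch.

```python
import time
from statistics import mean


def mean_runtime_ms(analyze, program, runs=10):
    """Return the mean wall-clock time of analyze(program), in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        analyze(program)  # hypothetical verifier entry point (an assumption)
        samples.append((time.perf_counter() - start) * 1000.0)
    return mean(samples)


# Stand-in for a real analysis: it only records that it was called.
calls = []


def fake_analyze(program):
    calls.append(program)


elapsed = mean_runtime_ms(fake_analyze, "ex-01")
```

Each entry in the "Correct Variant" and "Incorrect Variant" columns would correspond to one such mean, computed separately for the two variants of a program.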



ZU064-05-FPR paper-jfp 20 March 2016 10:4 





Program       | Lines | Checks | Correct Variant (ms) | Incorrect Variant (ms)
--------------+-------+--------+----------------------+-----------------------
intro1        |    13 |     11 |             26.6     |                  208.5
intro2        |    13 |     11 |             27.7     |                  210.2
intro3        |    13 |     12 |             30.6     |                   48.0
sum           |     9 |     12 |            100.4     |                  200.4
mult          |     9 |     20 |            188.6     |                  221.2
max           |    14 |     11 |             35.7     |                  220.2
mc91          |     8 |     15 |            169.6 (1) |                  115.5
ack           |     9 |     16 |             15.8     |                  205.5
repeat        |    11 |     11 |             10.1     |                   39.7
fhnhn         |    18 |     15 |             38.6     |                   64.4
fold-div      |    18 |     34 |            289.0     |                  250.2
hrec          |     9 |     13 |             21.8     |                  214.1
neg           |    20 |     15 |             95.4     |                  255.4
l-zipmap      |    16 |     31 |            483.0 (1) |                  152.9
hors          |    25 |     17 |             56.8     |                   58.4
r-lock        |    17 |     19 |             75.3     |                   90.1
r-file        |    50 |     62 |             84.3     |                  118.7
reverse       |    11 |     28 |             20.6     |                  288.5
isnil         |     9 |     17 |             14.9     |                    9.6
mem           |    12 |     28 |             28.2     |                  545.0
nth0          |    15 |     27 |             24.2     |                  806.6
zip           |    14 |     42 |            268.1     |                  688.2
a-max         |    18 |     33 |              528     |                  294.1
fold-fun-list |    20 |     32 |             70.5     |                  543.2
fold-left     |    14 |     27 |          2,028.4 (1) |                  145.5
fold-right    |    14 |     27 |          2,468.5 (1) |                  184.1
forall-leq    |    13 |     23 |               21     |                    335
harmonic      |    14 |     26 |            381.6     |                  101.2
length        |    13 |     24 |             14.5 (1) |                  104.4
map-filter    |    21 |     51 |          2,083.1 (1) |                  399.3
risers        |    26 |     61 |             38.9     |                  267.3
search        |    14 |     26 |          2,386.5 (1) |                   28.1
zip-unzip     |    27 |     62 |          2,064.4 (1) |                  758.8
--------------+-------+--------+----------------------+-----------------------
Total         |   527 |    859 |         14,190.7 (8) |                8,172.3

Table B 3. Higher-order model checking benchmarks






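Program  | Lines | Checks | Correct Variant | Incorrect Variant
---------+-------+--------+-----------------+------------------
boolflip |    10 |     17 |            10.5 |              38.8
mult-all |    10 |     18 |             9.2 |             532.8
mult-cps |    12 |     20 |           348.1 |              52.3
mult     |    10 |     17 |           102.9 |              36.9
sum-acm  |    10 |     15 |            41.1 |           1,132.3
sum-all  |     9 |     15 |             8.9 |             442.1
sum      |     8 |     14 |             9.0 |              35.5
---------+-------+--------+-----------------+------------------
Total    |    69 |    116 |           529.7 |           2,270.7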


Table B 4. Dependent type checking benchmarks 


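Program        | Lines | Checks | Correct Variant | Incorrect Variant
---------------+-------+--------+-----------------+------------------
all            |     9 |     16 |        23.0     |          23.2
even-odd       |    10 |     11 |       102.7     |          20.3
factorial-acc  |    10 |      9 |        16.3     |           7.0
factorial      |     7 |      8 |        13.1     |           5.9
fibonacci      |     7 |     11 |     1,345.7     |          97.3
filter-sat-all |    11 |     18 |     2,053.3 (1) |          23.1
filter         |    11 |     17 |        24.5     |          37.6
foldl1         |     9 |     17 |        22.0     |          22.2
foldl          |     8 |     10 |        22.6     |          22.4
foldr1         |     9 |     11 |        22.3     |          21.1
foldr          |     8 |     10 |        27.0     |          23.1
ho-opaque      |    10 |     14 |        19.1     |          19.9
id-dependent   |     8 |      3 |         4.5     |          17.6
insertion-sort |    14 |     30 |        57.6     |          54.9
map-foldr      |    11 |     20 |        24.2     |            24
mappend        |    11 |     31 |        29.7     |          26.9
map            |    10 |     13 |        23.9     |          23.7
recip-contract |     7 |      9 |         4.4     |           4.1
recip          |     8 |     15 |         5.7     |           5.5
rsa            |    14 |      5 |        17.7     |          25.1
sat-7          |    20 |     12 |     5,647.7     |         101.8
sum-filter     |    11 |     18 |        25.0     |          27.1
web (22)       |   158 |    213 |       268.6     |             -
web (110)      |   778 |  1,336 |           -     |      14,126.9 (5)
---------------+-------+--------+-----------------+------------------
Total          | 1,159 |  1,857 |     9,800.6 (1) |      14,760.7 (5)
---------------+-------+--------+-----------------+------------------
snake          |   164 |    246 |    38,602.3     |       3,034.2
tetris         |   267 |    338 |    12,303.5     |       2,255.0
zombie         |   249 |    476 |    21,276.2     |       1,152.0
---------------+-------+--------+-----------------+------------------
Total          |   680 |  1,060 |    72,182.0     |       6,441.2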


Table B 5. Higher-order symbolic execution benchmarks 